Thread: Query tuning

Query tuning

From
"Subbiah, Stalin"
Date:
Hello All,

This query runs forever and ever. Nature of this table being lots of
inserts/deletes/query, I vacuum it every half hour to keep the holes
reusable and nightly once vacuum analyze to update the optimizer. We've
got index on eventtime only. Running it for current day uses index range
scan and it runs within acceptable time. Below is the explain of the
query. Is the order by sequencenum desc prevents from applying limit
optimization?

explain SELECT *
FROM EVENTLOG
WHERE EVENTTIME>'07/23/06 16:00:00'
AND  EVENTTIME<'08/22/06 16:00:00'
AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
        OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
        OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 0;

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------------------------------------------------------
 Limit  (cost=15546930.29..15546931.54 rows=500 width=327)
   ->  Sort  (cost=15546930.29..15581924.84 rows=13997819 width=327)
         Sort Key: eventtime, sequencenum
         ->  Seq Scan on eventlog  (cost=0.00..2332700.25 rows=13997819
width=327)
               Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
without time zone) AND (((objdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
(5 rows)

Thanks,
Stalin
Pg version 8.0.1, suse 64bit.

Re: Query tuning

From
Chris
Date:
Subbiah, Stalin wrote:
> Hello All,
>
> This query runs forever and ever. Nature of this table being lots of
> inserts/deletes/query, I vacuum it every half hour to keep the holes
> reusable and nightly once vacuum analyze to update the optimizer. We've
> got index on eventtime only. Running it for current day uses index range
> scan and it runs within acceptable time. Below is the explain of the
> query. Is the order by sequencenum desc prevents from applying limit
> optimization?
>
> explain SELECT *
> FROM EVENTLOG
> WHERE EVENTTIME>'07/23/06 16:00:00'
> AND  EVENTTIME<'08/22/06 16:00:00'
> AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
>         OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
>         OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
> ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 0;
>
> QUERY PLAN
>
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> -------------------------------------------------------------
>  Limit  (cost=15546930.29..15546931.54 rows=500 width=327)
>    ->  Sort  (cost=15546930.29..15581924.84 rows=13997819 width=327)
>          Sort Key: eventtime, sequencenum
>          ->  Seq Scan on eventlog  (cost=0.00..2332700.25 rows=13997819
> width=327)
>                Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
> without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
> without time zone) AND (((objdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
> (5 rows)
>
> Thanks,
> Stalin
> Pg version 8.0.1, suse 64bit.

Firstly you should update to 8.0.8 - because it's in the same stream you
won't need to do a dump/initdb/reload like a major version change, it
should be a simple upgrade.

Can you send explain analyze instead of just explain?

It sounds like you're not analyz'ing enough - if you're doing lots of
updates/deletes/inserts then the statistics postgresql uses to choose
whether to do an index scan or something else will quickly be outdated
and so it'll have to go back to a full table scan every time..

Can you set up autovacuum to handle that for you more regularly?

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: Query tuning

From
"Subbiah, Stalin"
Date:
Actually these servers will be upgraded to 8.1.4 in couple of months.

Here you go with explain analyze.

# explain analyze SELECT *
FROM EVENTLOG
WHERE EVENTTIME>'07/23/06 16:00:00' AND  EVENTTIME<'08/22/06 16:00:00'
AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
        OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
        OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500;

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------------------------------------------------------
 Limit  (cost=15583110.14..15583111.39 rows=500 width=327) (actual
time=427771.568..427772.904 rows=500 loops=1)
   ->  Sort  (cost=15583108.89..15618188.88 rows=14031998 width=327)
(actual time=427770.504..427771.894 rows=1000 loops=1)
         Sort Key: eventtime, sequencenum
         ->  Seq Scan on eventlog  (cost=0.00..2334535.17 rows=14031998
width=327) (actual time=10.370..190038.764 rows=7699388 loops=1)
               Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
without time zone) AND (((objdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
 Total runtime: 437884.134 ms
(6 rows)

-----Original Message-----
From: Chris [mailto:dmagick@gmail.com]
Sent: Tuesday, August 22, 2006 6:37 PM
To: Subbiah, Stalin
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Query tuning

Subbiah, Stalin wrote:
> Hello All,
>
> This query runs forever and ever. Nature of this table being lots of
> inserts/deletes/query, I vacuum it every half hour to keep the holes
> reusable and nightly once vacuum analyze to update the optimizer.
> We've got index on eventtime only. Running it for current day uses
> index range scan and it runs within acceptable time. Below is the
> explain of the query. Is the order by sequencenum desc prevents from
> applying limit optimization?
>
> explain SELECT *
> FROM EVENTLOG
> WHERE EVENTTIME>'07/23/06 16:00:00'
> AND  EVENTTIME<'08/22/06 16:00:00'
> AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
>         OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
>         OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
> ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 0;
>
> QUERY PLAN
>
> ----------------------------------------------------------------------
> --
> ----------------------------------------------------------------------
> --
> ----------------------------------------------------------------------
> --
> ----------------------------------------------------------------------
> --
> -------------------------------------------------------------
>  Limit  (cost=15546930.29..15546931.54 rows=500 width=327)
>    ->  Sort  (cost=15546930.29..15581924.84 rows=13997819 width=327)
>          Sort Key: eventtime, sequencenum
>          ->  Seq Scan on eventlog  (cost=0.00..2332700.25
> rows=13997819
> width=327)
>                Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
> without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
> without time zone) AND (((objdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
> (5 rows)
>
> Thanks,
> Stalin
> Pg version 8.0.1, suse 64bit.

Firstly you should update to 8.0.8 - because it's in the same stream you
won't need to do a dump/initdb/reload like a major version change, it
should be a simple upgrade.

Can you send explain analyze instead of just explain?

It sounds like you're not analyz'ing enough - if you're doing lots of
updates/deletes/inserts then the statistics postgresql uses to choose
whether to do an index scan or something else will quickly be outdated
and so it'll have to go back to a full table scan every time..

Can you set up autovacuum to handle that for you more regularly?

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: Query tuning

From
Chris
Date:
Subbiah, Stalin wrote:
> Actually these servers will be upgraded to 8.1.4 in couple of months.

even so, you could get some bad data in there.
http://www.postgresql.org/docs/8.0/static/release.html . Go through the
old release notes and you'll find various race conditions, crashes etc.

> Here you go with explain analyze.
>
> # explain analyze SELECT *
> FROM EVENTLOG
> WHERE EVENTTIME>'07/23/06 16:00:00' AND  EVENTTIME<'08/22/06 16:00:00'
> AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
>         OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
>         OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
> ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500;
>
> QUERY PLAN
>
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> -------------------------------------------------------------
>  Limit  (cost=15583110.14..15583111.39 rows=500 width=327) (actual
> time=427771.568..427772.904 rows=500 loops=1)
>    ->  Sort  (cost=15583108.89..15618188.88 rows=14031998 width=327)
> (actual time=427770.504..427771.894 rows=1000 loops=1)
>          Sort Key: eventtime, sequencenum
>          ->  Seq Scan on eventlog  (cost=0.00..2334535.17 rows=14031998
> width=327) (actual time=10.370..190038.764 rows=7699388 loops=1)
>                Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
> without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
> without time zone) AND (((objdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
>  Total runtime: 437884.134 ms
> (6 rows)

If you analyze the table then run this again what plan does it come back
with?

I can't read explain output properly but I suspect (and I'm sure I'll be
corrected if need be) that the sort step is way out of whack and so is
the seq scan because the stats aren't up to date enough.

Do you have an index on objdomainid, objid and userdomainid (one index
per field) ? I wonder if that will help much.

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: Query tuning

From
"Subbiah, Stalin"
Date:
I get the same plan after running vacuum analyze. Nope, I don't have
index on objdomainid, objid and userdomainid. Only eventime has it.

-----Original Message-----
From: Chris [mailto:dmagick@gmail.com]
Sent: Tuesday, August 22, 2006 8:06 PM
To: Subbiah, Stalin
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Query tuning

Subbiah, Stalin wrote:
> Actually these servers will be upgraded to 8.1.4 in couple of months.

even so, you could get some bad data in there.
http://www.postgresql.org/docs/8.0/static/release.html . Go through the
old release notes and you'll find various race conditions, crashes etc.

> Here you go with explain analyze.
>
> # explain analyze SELECT *
> FROM EVENTLOG
> WHERE EVENTTIME>'07/23/06 16:00:00' AND  EVENTTIME<'08/22/06 16:00:00'

> AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
>         OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
>         OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
> ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500;
>
> QUERY PLAN
>
> ----------------------------------------------------------------------
> --
> ----------------------------------------------------------------------
> --
> ----------------------------------------------------------------------
> --
> ----------------------------------------------------------------------
> --
> -------------------------------------------------------------
>  Limit  (cost=15583110.14..15583111.39 rows=500 width=327) (actual
> time=427771.568..427772.904 rows=500 loops=1)
>    ->  Sort  (cost=15583108.89..15618188.88 rows=14031998 width=327)
> (actual time=427770.504..427771.894 rows=1000 loops=1)
>          Sort Key: eventtime, sequencenum
>          ->  Seq Scan on eventlog  (cost=0.00..2334535.17
> rows=14031998
> width=327) (actual time=10.370..190038.764 rows=7699388 loops=1)
>                Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
> without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
> without time zone) AND (((objdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
>  Total runtime: 437884.134 ms
> (6 rows)

If you analyze the table then run this again what plan does it come back
with?

I can't read explain output properly but I suspect (and I'm sure I'll be
corrected if need be) that the sort step is way out of whack and so is
the seq scan because the stats aren't up to date enough.

Do you have an index on objdomainid, objid and userdomainid (one index
per field) ? I wonder if that will help much.

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: Query tuning

From
"Dave Dutcher"
Date:
It seems to me that what would work best is an index scan backward on the
eventtime index.  I don't see why that wouldn't work for you, maybe the
planner is just esitmating the seq scan and sort is faster for some reason.
What does EXPLAIN say if you use a small limit and offset like 10?  Or what
does EXPLAIN say if you first run "set enable_seqscan=false;"  (If you get
the same plan, then I wouldn't bother running EXPLAIN ANALYZE, but if you
get a different plan I would run EXPLAIN ANALYZE to see if the new plan is
any faster.)



> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of
> Subbiah, Stalin
> Sent: Wednesday, August 23, 2006 1:03 PM
> To: Chris
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Query tuning
>
>
> I get the same plan after running vacuum analyze. Nope, I don't have
> index on objdomainid, objid and userdomainid. Only eventime has it.
>
> -----Original Message-----
> From: Chris [mailto:dmagick@gmail.com]
> Sent: Tuesday, August 22, 2006 8:06 PM
> To: Subbiah, Stalin
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Query tuning
>
> Subbiah, Stalin wrote:
> > Actually these servers will be upgraded to 8.1.4 in couple
> of months.
>
> even so, you could get some bad data in there.
> http://www.postgresql.org/docs/8.0/static/release.html . Go
> through the
> old release notes and you'll find various race conditions,
> crashes etc.
>
> > Here you go with explain analyze.
> >
> > # explain analyze SELECT *
> > FROM EVENTLOG
> > WHERE EVENTTIME>'07/23/06 16:00:00' AND
> EVENTTIME<'08/22/06 16:00:00'
>
> > AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
> >         OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
> >         OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
> > ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500;
> >
> > QUERY PLAN
> >
> >
> ----------------------------------------------------------------------
> > --
> >
> ----------------------------------------------------------------------
> > --
> >
> ----------------------------------------------------------------------
> > --
> >
> ----------------------------------------------------------------------
> > --
> > -------------------------------------------------------------
> >  Limit  (cost=15583110.14..15583111.39 rows=500 width=327) (actual
> > time=427771.568..427772.904 rows=500 loops=1)
> >    ->  Sort  (cost=15583108.89..15618188.88 rows=14031998
> width=327)
> > (actual time=427770.504..427771.894 rows=1000 loops=1)
> >          Sort Key: eventtime, sequencenum
> >          ->  Seq Scan on eventlog  (cost=0.00..2334535.17
> > rows=14031998
> > width=327) (actual time=10.370..190038.764 rows=7699388 loops=1)
> >                Filter: ((eventtime > '2006-07-23
> 16:00:00'::timestamp
> > without time zone) AND (eventtime < '2006-08-22
> 16:00:00'::timestamp
> > without time zone) AND (((objdomainid)::text =
> > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
> > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
> > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
> >  Total runtime: 437884.134 ms
> > (6 rows)
>
> If you analyze the table then run this again what plan does
> it come back
> with?
>
> I can't read explain output properly but I suspect (and I'm
> sure I'll be
> corrected if need be) that the sort step is way out of whack and so is
> the seq scan because the stats aren't up to date enough.
>
> Do you have an index on objdomainid, objid and userdomainid (one index
> per field) ? I wonder if that will help much.
>
> --
> Postgresql & php tutorials
> http://www.designmagick.com/
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>


Re: Query tuning

From
"Subbiah, Stalin"
Date:
Changing limit or offset to a small number doesn't have any change in
plans. Likewise enable_seqscan to false. They still take 8-10 mins to
runs.

-----Original Message-----
From: Dave Dutcher [mailto:dave@tridecap.com]
Sent: Wednesday, August 23, 2006 4:20 PM
To: Subbiah, Stalin
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Query tuning

It seems to me that what would work best is an index scan backward on
the eventtime index.  I don't see why that wouldn't work for you, maybe
the planner is just esitmating the seq scan and sort is faster for some
reason.
What does EXPLAIN say if you use a small limit and offset like 10?  Or
what does EXPLAIN say if you first run "set enable_seqscan=false;"  (If
you get the same plan, then I wouldn't bother running EXPLAIN ANALYZE,
but if you get a different plan I would run EXPLAIN ANALYZE to see if
the new plan is any faster.)



> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Subbiah,
> Stalin
> Sent: Wednesday, August 23, 2006 1:03 PM
> To: Chris
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Query tuning
>
>
> I get the same plan after running vacuum analyze. Nope, I don't have
> index on objdomainid, objid and userdomainid. Only eventime has it.
>
> -----Original Message-----
> From: Chris [mailto:dmagick@gmail.com]
> Sent: Tuesday, August 22, 2006 8:06 PM
> To: Subbiah, Stalin
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Query tuning
>
> Subbiah, Stalin wrote:
> > Actually these servers will be upgraded to 8.1.4 in couple
> of months.
>
> even so, you could get some bad data in there.
> http://www.postgresql.org/docs/8.0/static/release.html . Go through
> the old release notes and you'll find various race conditions, crashes

> etc.
>
> > Here you go with explain analyze.
> >
> > # explain analyze SELECT *
> > FROM EVENTLOG
> > WHERE EVENTTIME>'07/23/06 16:00:00' AND
> EVENTTIME<'08/22/06 16:00:00'
>
> > AND  (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
> >         OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
> >         OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
> > ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500;
> >
> > QUERY PLAN
> >
> >
> ----------------------------------------------------------------------
> > --
> >
> ----------------------------------------------------------------------
> > --
> >
> ----------------------------------------------------------------------
> > --
> >
> ----------------------------------------------------------------------
> > --
> > -------------------------------------------------------------
> >  Limit  (cost=15583110.14..15583111.39 rows=500 width=327) (actual
> > time=427771.568..427772.904 rows=500 loops=1)
> >    ->  Sort  (cost=15583108.89..15618188.88 rows=14031998
> width=327)
> > (actual time=427770.504..427771.894 rows=1000 loops=1)
> >          Sort Key: eventtime, sequencenum
> >          ->  Seq Scan on eventlog  (cost=0.00..2334535.17
> > rows=14031998
> > width=327) (actual time=10.370..190038.764 rows=7699388 loops=1)
> >                Filter: ((eventtime > '2006-07-23
> 16:00:00'::timestamp
> > without time zone) AND (eventtime < '2006-08-22
> 16:00:00'::timestamp
> > without time zone) AND (((objdomainid)::text =
> > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
> > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
> > 'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
> >  Total runtime: 437884.134 ms
> > (6 rows)
>
> If you analyze the table then run this again what plan does it come
> back with?
>
> I can't read explain output properly but I suspect (and I'm sure I'll
> be corrected if need be) that the sort step is way out of whack and so

> is the seq scan because the stats aren't up to date enough.
>
> Do you have an index on objdomainid, objid and userdomainid (one index

> per field) ? I wonder if that will help much.
>
> --
> Postgresql & php tutorials
> http://www.designmagick.com/
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>