Thread: index not used afer VACUUM ANALYZE

index not used afer VACUUM ANALYZE

From

mike

Date:

24 August 2003, 17:01:36

Hi,
I hav a db as specified in nit.sql
flows has 763488 entries.

After dropping/creating/loading the db and running auswert.sh I get
the attached result from query1.txt.
After 'VACUUM ANALYZE' I get the results from query2.txt

As you can see, the indexes are not used any longer.
Why?

Bye/2
---
Michael Reifenberger, Business Unit Manager SAP-Basis, Plaut Consulting
Comp: Michael.Reifenberger@plaut.de | Priv: Michael@Reifenberger.com
      http://www.plaut.de           |       http://www.Reifenberger.com

Re: index not used afer VACUUM ANALYZE

From

Stephan Szabo

Date:

25 August 2003, 04:44:45

On Thu, 21 Aug 2003, mike wrote:

> Hi,
> I hav a db as specified in nit.sql
> flows has 763488 entries.
>
> After dropping/creating/loading the db and running auswert.sh I get
> the attached result from query1.txt.
> After 'VACUUM ANALYZE' I get the results from query2.txt
>
> As you can see, the indexes are not used any longer.
> Why?

It looks like the row estimates changed to say that a large % of the rows
match the condition.  Is that true?  In any case, what does EXPLAIN
ANALYZE rather than EXPLAIN say for the various queries?

Re: index not used afer VACUUM ANALYZE

From

Stephan Szabo

Date:

25 August 2003, 12:53:41

On Mon, 25 Aug 2003, mike wrote:

> On Mon, 25 Aug 2003, Stephan Szabo wrote:
>
> > Date: Mon, 25 Aug 2003 00:43:56 -0700 (PDT)
> > From: Stephan Szabo <sszabo@megazone.bigpanda.com>
> > To: mike <mike@Reifenberger.com>
> > Cc: pgsql-bugs@postgresql.org
> > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE
> >
> > On Thu, 21 Aug 2003, mike wrote:
> >
> > > Hi,
> > > I hav a db as specified in nit.sql
> > > flows has 763488 entries.
> > >
> > > After dropping/creating/loading the db and running auswert.sh I get
> > > the attached result from query1.txt.
> > > After 'VACUUM ANALYZE' I get the results from query2.txt
> > >
> > > As you can see, the indexes are not used any longer.
> > > Why?
> >
> > It looks like the row estimates changed to say that a large % of the rows
> > match the condition.  Is that true?  In any case, what does EXPLAIN
>
> Partially.
> I have statistical records (763488) - various IP-Traffic - collected for one
> month.
> After collection I try to condense the data for dayly statistics.
>
> The EXPLAIN ANALYZE output is attached:
> a1.txt is before, a2.txt after VACUUM ANALYZE run.

There are two things that jump out at me, the first is that the group
aggregate estimates on the after are way higher than reality and that it
looks to me that the sort before the group aggregate is taking longer than
expected.  What do you have sort_mem set to since that will affect whether
sorts are in memory and I believe whether it thinks it can use a hash
aggregate on that nubmer of rows.

Re: index not used afer VACUUM ANALYZE

From

mike

Date:

25 August 2003, 15:53:23

On Mon, 25 Aug 2003, Stephan Szabo wrote:

> Date: Mon, 25 Aug 2003 00:43:56 -0700 (PDT)
> From: Stephan Szabo <sszabo@megazone.bigpanda.com>
> To: mike <mike@Reifenberger.com>
> Cc: pgsql-bugs@postgresql.org
> Subject: Re: [BUGS] index not used afer VACUUM ANALYZE
>
> On Thu, 21 Aug 2003, mike wrote:
>
> > Hi,
> > I hav a db as specified in nit.sql
> > flows has 763488 entries.
> >
> > After dropping/creating/loading the db and running auswert.sh I get
> > the attached result from query1.txt.
> > After 'VACUUM ANALYZE' I get the results from query2.txt
> >
> > As you can see, the indexes are not used any longer.
> > Why?
>
> It looks like the row estimates changed to say that a large % of the rows
> match the condition.  Is that true?  In any case, what does EXPLAIN

Partially.
I have statistical records (763488) - various IP-Traffic - collected for one
month.
After collection I try to condense the data for dayly statistics.

The EXPLAIN ANALYZE output is attached:
a1.txt is before, a2.txt after VACUUM ANALYZE run.



Bye/2
---
Michael Reifenberger, Business Unit Manager SAP-Basis, Plaut Consulting
Comp: Michael.Reifenberger@plaut.de | Priv: Michael@Reifenberger.com
      http://www.plaut.de           |       http://www.Reifenberger.com

Re: index not used afer VACUUM ANALYZE

From

Stephan Szabo

Date:

26 August 2003, 17:59:36

On Tue, 26 Aug 2003, mike wrote:

> On Mon, 25 Aug 2003, Stephan Szabo wrote:
>
> > Date: Mon, 25 Aug 2003 08:52:34 -0700 (PDT)
> > From: Stephan Szabo <sszabo@megazone.bigpanda.com>
> > To: mike <mike@Reifenberger.com>
> > Cc: pgsql-bugs@postgresql.org
> > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE
> >
> > On Mon, 25 Aug 2003, mike wrote:
> >
> > > On Mon, 25 Aug 2003, Stephan Szabo wrote:
> > >
> > > > Date: Mon, 25 Aug 2003 00:43:56 -0700 (PDT)
> > > > From: Stephan Szabo <sszabo@megazone.bigpanda.com>
> > > > To: mike <mike@Reifenberger.com>
> > > > Cc: pgsql-bugs@postgresql.org
> > > > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE
> > > >
> > > > On Thu, 21 Aug 2003, mike wrote:
> > > >
> > > > > Hi,
> > > > > I hav a db as specified in nit.sql
> > > > > flows has 763488 entries.
> > > > >
> > > > > After dropping/creating/loading the db and running auswert.sh I get
> > > > > the attached result from query1.txt.
> > > > > After 'VACUUM ANALYZE' I get the results from query2.txt
> > > > >
> > > > > As you can see, the indexes are not used any longer.
> > > > > Why?
> > > >
> > > > It looks like the row estimates changed to say that a large % of the rows
> > > > match the condition.  Is that true?  In any case, what does EXPLAIN
> > >
> > > Partially.
> > > I have statistical records (763488) - various IP-Traffic - collected for one
> > > month.
> > > After collection I try to condense the data for dayly statistics.
> > >
> > > The EXPLAIN ANALYZE output is attached:
> > > a1.txt is before, a2.txt after VACUUM ANALYZE run.
> >
> > There are two things that jump out at me, the first is that the group
> > aggregate estimates on the after are way higher than reality and that it
> > looks to me that the sort before the group aggregate is taking longer than
> > expected.  What do you have sort_mem set to since that will affect whether
> > sorts are in memory and I believe whether it thinks it can use a hash
> > aggregate on that nubmer of rows.
> >
>
> sort_men was at the default.
> But setting it to 10240 doesn't seem to change the seqscan on flows.

But does it change the amount of time the query actually takes to run?
seqscans are not always slower nor are they necessarily the actual problem
here.  The problem seems to be choosing a group aggregate + sort which is
taking alot of time, if you look at the real time on the steps below that
it's approximately the same for seqscan or index scan.

Re: index not used afer VACUUM ANALYZE

From

mike

Date:

26 August 2003, 21:26:30

On Mon, 25 Aug 2003, Stephan Szabo wrote:

> Date: Mon, 25 Aug 2003 08:52:34 -0700 (PDT)
> From: Stephan Szabo <sszabo@megazone.bigpanda.com>
> To: mike <mike@Reifenberger.com>
> Cc: pgsql-bugs@postgresql.org
> Subject: Re: [BUGS] index not used afer VACUUM ANALYZE
>
> On Mon, 25 Aug 2003, mike wrote:
>
> > On Mon, 25 Aug 2003, Stephan Szabo wrote:
> >
> > > Date: Mon, 25 Aug 2003 00:43:56 -0700 (PDT)
> > > From: Stephan Szabo <sszabo@megazone.bigpanda.com>
> > > To: mike <mike@Reifenberger.com>
> > > Cc: pgsql-bugs@postgresql.org
> > > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE
> > >
> > > On Thu, 21 Aug 2003, mike wrote:
> > >
> > > > Hi,
> > > > I hav a db as specified in nit.sql
> > > > flows has 763488 entries.
> > > >
> > > > After dropping/creating/loading the db and running auswert.sh I get
> > > > the attached result from query1.txt.
> > > > After 'VACUUM ANALYZE' I get the results from query2.txt
> > > >
> > > > As you can see, the indexes are not used any longer.
> > > > Why?
> > >
> > > It looks like the row estimates changed to say that a large % of the rows
> > > match the condition.  Is that true?  In any case, what does EXPLAIN
> >
> > Partially.
> > I have statistical records (763488) - various IP-Traffic - collected for one
> > month.
> > After collection I try to condense the data for dayly statistics.
> >
> > The EXPLAIN ANALYZE output is attached:
> > a1.txt is before, a2.txt after VACUUM ANALYZE run.
>
> There are two things that jump out at me, the first is that the group
> aggregate estimates on the after are way higher than reality and that it
> looks to me that the sort before the group aggregate is taking longer than
> expected.  What do you have sort_mem set to since that will affect whether
> sorts are in memory and I believe whether it thinks it can use a hash
> aggregate on that nubmer of rows.
>

sort_men was at the default.
But setting it to 10240 doesn't seem to change the seqscan on flows.


Bye/2
---
Michael Reifenberger, Business Unit Manager SAP-Basis, Plaut Consulting
Comp: Michael.Reifenberger@plaut.de | Priv: Michael@Reifenberger.com
      http://www.plaut.de           |       http://www.Reifenberger.com

Re: index not used afer VACUUM ANALYZE

From

Stephan Szabo

Date:

27 August 2003, 22:01:03

On Wed, 27 Aug 2003, mike wrote:

> Hi,
> On Tue, 26 Aug 2003, Stephan Szabo wrote:
> ...
> > But does it change the amount of time the query actually takes to run?
> > seqscans are not always slower nor are they necessarily the actual problem
> > here.  The problem seems to be choosing a group aggregate + sort which is
> > taking alot of time, if you look at the real time on the steps below that
> > it's approximately the same for seqscan or index scan.
>
> Ok, with plenty of sort_mem (327680) the seqscan seems to be faster.

Eek.  That's really big.  Was that the swap over point, or was that just
an arbitrary large value.  Basically it looks like the difference really
is whether it uses the hash aggregate on the first and third query or not.
The question is figuring out at what point that switches over, and how bad
it is to use that much ram for this query (btw: what does say top/free/ps
say while the query is running for large sort_mem vs small sort_mem)?

Re: index not used afer VACUUM ANALYZE

From

mike

Date:

28 August 2003, 01:28:36

Hi,
On Tue, 26 Aug 2003, Stephan Szabo wrote:
...
> But does it change the amount of time the query actually takes to run?
> seqscans are not always slower nor are they necessarily the actual problem
> here.  The problem seems to be choosing a group aggregate + sort which is
> taking alot of time, if you look at the real time on the steps below that
> it's approximately the same for seqscan or index scan.

Ok, with plenty of sort_mem (327680) the seqscan seems to be faster.
Using 64000 shared_buffers it's not that amazing since nothing hits the disks.
As last times aus1.txt is before aus2.txt after the VACUUM ANALYZE.


Bye/2
---
Michael Reifenberger, Business Unit Manager SAP-Basis, Plaut Consulting
Comp: Michael.Reifenberger@plaut.de | Priv: Michael@Reifenberger.com
      http://www.plaut.de           |       http://www.Reifenberger.com