Thread: index not used afer VACUUM ANALYZE
Hi, I hav a db as specified in nit.sql flows has 763488 entries. After dropping/creating/loading the db and running auswert.sh I get the attached result from query1.txt. After 'VACUUM ANALYZE' I get the results from query2.txt As you can see, the indexes are not used any longer. Why? Bye/2 --- Michael Reifenberger, Business Unit Manager SAP-Basis, Plaut Consulting Comp: Michael.Reifenberger@plaut.de | Priv: Michael@Reifenberger.com http://www.plaut.de | http://www.Reifenberger.com
On Thu, 21 Aug 2003, mike wrote: > Hi, > I hav a db as specified in nit.sql > flows has 763488 entries. > > After dropping/creating/loading the db and running auswert.sh I get > the attached result from query1.txt. > After 'VACUUM ANALYZE' I get the results from query2.txt > > As you can see, the indexes are not used any longer. > Why? It looks like the row estimates changed to say that a large % of the rows match the condition. Is that true? In any case, what does EXPLAIN ANALYZE rather than EXPLAIN say for the various queries?
On Mon, 25 Aug 2003, mike wrote: > On Mon, 25 Aug 2003, Stephan Szabo wrote: > > > Date: Mon, 25 Aug 2003 00:43:56 -0700 (PDT) > > From: Stephan Szabo <sszabo@megazone.bigpanda.com> > > To: mike <mike@Reifenberger.com> > > Cc: pgsql-bugs@postgresql.org > > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE > > > > On Thu, 21 Aug 2003, mike wrote: > > > > > Hi, > > > I hav a db as specified in nit.sql > > > flows has 763488 entries. > > > > > > After dropping/creating/loading the db and running auswert.sh I get > > > the attached result from query1.txt. > > > After 'VACUUM ANALYZE' I get the results from query2.txt > > > > > > As you can see, the indexes are not used any longer. > > > Why? > > > > It looks like the row estimates changed to say that a large % of the rows > > match the condition. Is that true? In any case, what does EXPLAIN > > Partially. > I have statistical records (763488) - various IP-Traffic - collected for one > month. > After collection I try to condense the data for dayly statistics. > > The EXPLAIN ANALYZE output is attached: > a1.txt is before, a2.txt after VACUUM ANALYZE run. There are two things that jump out at me, the first is that the group aggregate estimates on the after are way higher than reality and that it looks to me that the sort before the group aggregate is taking longer than expected. What do you have sort_mem set to since that will affect whether sorts are in memory and I believe whether it thinks it can use a hash aggregate on that nubmer of rows.
On Mon, 25 Aug 2003, Stephan Szabo wrote: > Date: Mon, 25 Aug 2003 00:43:56 -0700 (PDT) > From: Stephan Szabo <sszabo@megazone.bigpanda.com> > To: mike <mike@Reifenberger.com> > Cc: pgsql-bugs@postgresql.org > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE > > On Thu, 21 Aug 2003, mike wrote: > > > Hi, > > I hav a db as specified in nit.sql > > flows has 763488 entries. > > > > After dropping/creating/loading the db and running auswert.sh I get > > the attached result from query1.txt. > > After 'VACUUM ANALYZE' I get the results from query2.txt > > > > As you can see, the indexes are not used any longer. > > Why? > > It looks like the row estimates changed to say that a large % of the rows > match the condition. Is that true? In any case, what does EXPLAIN Partially. I have statistical records (763488) - various IP-Traffic - collected for one month. After collection I try to condense the data for dayly statistics. The EXPLAIN ANALYZE output is attached: a1.txt is before, a2.txt after VACUUM ANALYZE run. Bye/2 --- Michael Reifenberger, Business Unit Manager SAP-Basis, Plaut Consulting Comp: Michael.Reifenberger@plaut.de | Priv: Michael@Reifenberger.com http://www.plaut.de | http://www.Reifenberger.com
On Tue, 26 Aug 2003, mike wrote: > On Mon, 25 Aug 2003, Stephan Szabo wrote: > > > Date: Mon, 25 Aug 2003 08:52:34 -0700 (PDT) > > From: Stephan Szabo <sszabo@megazone.bigpanda.com> > > To: mike <mike@Reifenberger.com> > > Cc: pgsql-bugs@postgresql.org > > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE > > > > On Mon, 25 Aug 2003, mike wrote: > > > > > On Mon, 25 Aug 2003, Stephan Szabo wrote: > > > > > > > Date: Mon, 25 Aug 2003 00:43:56 -0700 (PDT) > > > > From: Stephan Szabo <sszabo@megazone.bigpanda.com> > > > > To: mike <mike@Reifenberger.com> > > > > Cc: pgsql-bugs@postgresql.org > > > > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE > > > > > > > > On Thu, 21 Aug 2003, mike wrote: > > > > > > > > > Hi, > > > > > I hav a db as specified in nit.sql > > > > > flows has 763488 entries. > > > > > > > > > > After dropping/creating/loading the db and running auswert.sh I get > > > > > the attached result from query1.txt. > > > > > After 'VACUUM ANALYZE' I get the results from query2.txt > > > > > > > > > > As you can see, the indexes are not used any longer. > > > > > Why? > > > > > > > > It looks like the row estimates changed to say that a large % of the rows > > > > match the condition. Is that true? In any case, what does EXPLAIN > > > > > > Partially. > > > I have statistical records (763488) - various IP-Traffic - collected for one > > > month. > > > After collection I try to condense the data for dayly statistics. > > > > > > The EXPLAIN ANALYZE output is attached: > > > a1.txt is before, a2.txt after VACUUM ANALYZE run. > > > > There are two things that jump out at me, the first is that the group > > aggregate estimates on the after are way higher than reality and that it > > looks to me that the sort before the group aggregate is taking longer than > > expected. What do you have sort_mem set to since that will affect whether > > sorts are in memory and I believe whether it thinks it can use a hash > > aggregate on that nubmer of rows. > > > > sort_men was at the default. > But setting it to 10240 doesn't seem to change the seqscan on flows. But does it change the amount of time the query actually takes to run? seqscans are not always slower nor are they necessarily the actual problem here. The problem seems to be choosing a group aggregate + sort which is taking alot of time, if you look at the real time on the steps below that it's approximately the same for seqscan or index scan.
On Mon, 25 Aug 2003, Stephan Szabo wrote: > Date: Mon, 25 Aug 2003 08:52:34 -0700 (PDT) > From: Stephan Szabo <sszabo@megazone.bigpanda.com> > To: mike <mike@Reifenberger.com> > Cc: pgsql-bugs@postgresql.org > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE > > On Mon, 25 Aug 2003, mike wrote: > > > On Mon, 25 Aug 2003, Stephan Szabo wrote: > > > > > Date: Mon, 25 Aug 2003 00:43:56 -0700 (PDT) > > > From: Stephan Szabo <sszabo@megazone.bigpanda.com> > > > To: mike <mike@Reifenberger.com> > > > Cc: pgsql-bugs@postgresql.org > > > Subject: Re: [BUGS] index not used afer VACUUM ANALYZE > > > > > > On Thu, 21 Aug 2003, mike wrote: > > > > > > > Hi, > > > > I hav a db as specified in nit.sql > > > > flows has 763488 entries. > > > > > > > > After dropping/creating/loading the db and running auswert.sh I get > > > > the attached result from query1.txt. > > > > After 'VACUUM ANALYZE' I get the results from query2.txt > > > > > > > > As you can see, the indexes are not used any longer. > > > > Why? > > > > > > It looks like the row estimates changed to say that a large % of the rows > > > match the condition. Is that true? In any case, what does EXPLAIN > > > > Partially. > > I have statistical records (763488) - various IP-Traffic - collected for one > > month. > > After collection I try to condense the data for dayly statistics. > > > > The EXPLAIN ANALYZE output is attached: > > a1.txt is before, a2.txt after VACUUM ANALYZE run. > > There are two things that jump out at me, the first is that the group > aggregate estimates on the after are way higher than reality and that it > looks to me that the sort before the group aggregate is taking longer than > expected. What do you have sort_mem set to since that will affect whether > sorts are in memory and I believe whether it thinks it can use a hash > aggregate on that nubmer of rows. > sort_men was at the default. But setting it to 10240 doesn't seem to change the seqscan on flows. Bye/2 --- Michael Reifenberger, Business Unit Manager SAP-Basis, Plaut Consulting Comp: Michael.Reifenberger@plaut.de | Priv: Michael@Reifenberger.com http://www.plaut.de | http://www.Reifenberger.com
On Wed, 27 Aug 2003, mike wrote: > Hi, > On Tue, 26 Aug 2003, Stephan Szabo wrote: > ... > > But does it change the amount of time the query actually takes to run? > > seqscans are not always slower nor are they necessarily the actual problem > > here. The problem seems to be choosing a group aggregate + sort which is > > taking alot of time, if you look at the real time on the steps below that > > it's approximately the same for seqscan or index scan. > > Ok, with plenty of sort_mem (327680) the seqscan seems to be faster. Eek. That's really big. Was that the swap over point, or was that just an arbitrary large value. Basically it looks like the difference really is whether it uses the hash aggregate on the first and third query or not. The question is figuring out at what point that switches over, and how bad it is to use that much ram for this query (btw: what does say top/free/ps say while the query is running for large sort_mem vs small sort_mem)?
Hi, On Tue, 26 Aug 2003, Stephan Szabo wrote: ... > But does it change the amount of time the query actually takes to run? > seqscans are not always slower nor are they necessarily the actual problem > here. The problem seems to be choosing a group aggregate + sort which is > taking alot of time, if you look at the real time on the steps below that > it's approximately the same for seqscan or index scan. Ok, with plenty of sort_mem (327680) the seqscan seems to be faster. Using 64000 shared_buffers it's not that amazing since nothing hits the disks. As last times aus1.txt is before aus2.txt after the VACUUM ANALYZE. Bye/2 --- Michael Reifenberger, Business Unit Manager SAP-Basis, Plaut Consulting Comp: Michael.Reifenberger@plaut.de | Priv: Michael@Reifenberger.com http://www.plaut.de | http://www.Reifenberger.com