Thread: Surprise :-(
I ran a vacuum analyze on a database. Now the query I ran vacuum analyze for takes twice as long, and all the other queries I tested take longer, too. Please help. Thank you all.

On Thu, 5 Sep 2002, Mihai Gheorghiu wrote:

> I ran a vacuum analyze on a database. Now the query I ran vacuum analyze for
> takes twice as long, and all the other queries I tested take longer, too.
> Please help.

What are the queries and explain output for the queries (preferably
including the old state if you have explain from that as well).

explain select account, sum(amount) from tbas_transactions where isposted
and trxtype = 'MP' group by account;
psql:xx.txt:1:  NOTICE:  QUERY PLAN:

Aggregate  (cost=10874.64..10889.76 rows=302 width=24)
  ->  Group  (cost=10874.64..10882.20 rows=3025 width=24)
        ->  Sort  (cost=10874.64..10874.64 rows=3025 width=24)
              ->  Index Scan using trx_trxtype_idx on tbas_transactions
                    (cost=0.00..10699.78 rows=3025 width=24)

EXPLAIN

Sorry, I do not have an explain from before vacuum analyze.
The table has ~700k rows and indices on account, trxtype and a few other
fields used in other queries.

On Thu, 5 Sep 2002, Mihai Gheorghiu wrote:

> explain select account, sum(amount) from tbas_transactions where isposted
> and trxtype = 'MP' group by account;
> psql:xx.txt:1:  NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=10874.64..10889.76 rows=302 width=24)
>   ->  Group  (cost=10874.64..10882.20 rows=3025 width=24)
>         ->  Sort  (cost=10874.64..10874.64 rows=3025 width=24)
>               ->  Index Scan using trx_trxtype_idx on tbas_transactions
>                     (cost=0.00..10699.78 rows=3025 width=24)
>
> EXPLAIN
>
> Sorry, I do not have an explain from before vacuum analyze.
> The table has ~700k rows and indices on account, trxtype and a few other
> fields used in other queries.

If you set enable_indexscan=off and then run explain / run the query,
is it better?

For other questions, how many rows have trxtype='MP' really, and what
version are you running?

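Concretely, that comparison would look something like this (the query is the
one quoted above; remember to switch the setting back on afterwards):

  SET enable_indexscan = off;
  EXPLAIN SELECT account, sum(amount)
            FROM tbas_transactions
           WHERE isposted AND trxtype = 'MP'
           GROUP BY account;
  SET enable_indexscan = on;
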
On Thu, 5 Sep 2002, Mihai Gheorghiu wrote:

> explain select account, sum(amount) from tbas_transactions where isposted
> and trxtype = 'MP' group by account;
> psql:xx.txt:1:  NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=10874.64..10889.76 rows=302 width=24)
>   ->  Group  (cost=10874.64..10882.20 rows=3025 width=24)
>         ->  Sort  (cost=10874.64..10874.64 rows=3025 width=24)
>               ->  Index Scan using trx_trxtype_idx on tbas_transactions
>                     (cost=0.00..10699.78 rows=3025 width=24)
>
> EXPLAIN
>
> Sorry, I do not have an explain from before vacuum analyze.
> The table has ~700k rows and indices on account, trxtype and a few other
> fields used in other queries.

Okay, so, on the face of it it's a pretty good plan; it has chosen an index
scan returning only 3025 tuples out of 700,000, after all. What does EXPLAIN
ANALYZE <your query> show?

Also you could try and replicate the previous plan by doing a SET
ENABLE_INDEXSCAN = OFF, although this is unlikely to give the previous plan,
imo. A better approach would be to drop the index currently used, once sure
of being able to recreate it of course, repeating as necessary to see which
index was giving the previous level of performance.

What does

  SELECT * FROM pg_statistic
   WHERE starelid = (SELECT oid FROM pg_class
                      WHERE relname = 'tbas_transactions');

give? Don't forget to indicate which column of your table has which
attribute number in that output.

Assuming the name of the index used is descriptive, the difference must be
due to isposted being true being more selective than the trxtype test. It
starts to get difficult at this point without knowing the explain analyze
results and something like

  SELECT isposted, count(1) FROM tbas_transactions GROUP BY isposted;

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants

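For reference, the EXPLAIN ANALYZE run suggested above would be along these
lines; unlike plain EXPLAIN it actually executes the query, and it is only
available from 7.2 onwards:

  EXPLAIN ANALYZE
  SELECT account, sum(amount)
    FROM tbas_transactions
   WHERE isposted AND trxtype = 'MP'
   GROUP BY account;
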
PG7.1.3 on RH7.1 on Dell PowerEdge 2500SC P3-933, 1GB RAM, 18GB SCSI160
There are 350k rows with trxtype=MP

With indexscan=off:
Aggregate  (cost=22975.15..22990.27 rows=302 width=24)
  ->  Group  (cost=22975.15..22982.71 rows=3025 width=24)
        ->  Sort  (cost=22975.15..22975.15 rows=3025 width=24)
              ->  Seq Scan on tbas_transactions
                    (cost=0.00..22800.29 rows=3025 width=24)
Time: 25.9s

With indexscan=on:
Aggregate  (cost=10874.64..10889.76 rows=302 width=24)
  ->  Group  (cost=10874.64..10882.20 rows=3025 width=24)
        ->  Sort  (cost=10874.64..10874.64 rows=3025 width=24)
              ->  Index Scan using trx_trxtype_idx on tbas_transactions
                    (cost=0.00..10699.78 rows=3025 width=24)
Time: 24.9s

The point is I need to run this query in a fraction of the above time,
otherwise I'm in deep trouble.
Any suggestion is welcome.

On Fri, 6 Sep 2002, Mihai Gheorghiu wrote:

> PG7.1.3 on RH7.1 on Dell PowerEdge 2500SC P3-933, 1GB RAM, 18GB SCSI160
> There are 350k rows with trxtype=MP
>
> With indexscan=off:
> Aggregate  (cost=22975.15..22990.27 rows=302 width=24)
>   ->  Group  (cost=22975.15..22982.71 rows=3025 width=24)
>         ->  Sort  (cost=22975.15..22975.15 rows=3025 width=24)
>               ->  Seq Scan on tbas_transactions
>                     (cost=0.00..22800.29 rows=3025 width=24)
> Time: 25.9s
>
> With indexscan=on:
> Aggregate  (cost=10874.64..10889.76 rows=302 width=24)
>   ->  Group  (cost=10874.64..10882.20 rows=3025 width=24)
>         ->  Sort  (cost=10874.64..10874.64 rows=3025 width=24)
>               ->  Index Scan using trx_trxtype_idx on tbas_transactions
>                     (cost=0.00..10699.78 rows=3025 width=24)
> Time: 24.9s
>
> The point is I need to run this query in a fraction of the above time,
> otherwise I'm in deep trouble.
> Any suggestion is welcome.

Hmm, in general a multicolumn index on (trxtype,account) might enable it
to get rid of the sort step.

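For example, something along these lines (the index name here is just
illustrative):

  CREATE INDEX trx_trxtype_account_idx
      ON tbas_transactions (trxtype, account);
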
On Fri, 6 Sep 2002, Mihai Gheorghiu wrote:

> PG7.1.3 on RH7.1 on Dell PowerEdge 2500SC P3-933, 1GB RAM, 18GB SCSI160
> There are 350k rows with trxtype=MP
>
> With indexscan=off:
> Aggregate  (cost=22975.15..22990.27 rows=302 width=24)
>   ->  Group  (cost=22975.15..22982.71 rows=3025 width=24)
>         ->  Sort  (cost=22975.15..22975.15 rows=3025 width=24)
>               ->  Seq Scan on tbas_transactions
>                     (cost=0.00..22800.29 rows=3025 width=24)
> Time: 25.9s
>
> With indexscan=on:
> Aggregate  (cost=10874.64..10889.76 rows=302 width=24)
>   ->  Group  (cost=10874.64..10882.20 rows=3025 width=24)
>         ->  Sort  (cost=10874.64..10874.64 rows=3025 width=24)
>               ->  Index Scan using trx_trxtype_idx on tbas_transactions
>                     (cost=0.00..10699.78 rows=3025 width=24)
> Time: 24.9s
>
> The point is I need to run this query in a fraction of the above time,
> otherwise I'm in deep trouble.
> Any suggestion is welcome.

Yes, drop the index on trxtype so that it's selecting on isposted.
Obviously, knowing things like the actual number of rows returned by explain
analyze (only 7.2.x?) as per my other message would be more useful.

Your trxtype index is actually selecting 50% of the rows but thinks it only
has 3025...hmmmm...typing that made me think something is wrong...ah, I see
7.1.3, not 7.2.x...how many other values for this field are there? Even
7.1.3 should have caught that as a common value, right? However, in short,
unless isposted is true for lots of those 350k rows, dropping the index
would hopefully prompt a switch to an index on that column. Or, create a
multicolumn index on (isposted,trxtype).

BTW, there must be some sort of data clustering going on, because that index
scan is faster than the sequential scan even though it's fetching 50% of the
table. Perhaps it's a result of caching.

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants

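As a rough sketch of the two options (the new index name is illustrative,
and make sure you can recreate anything you drop):

  DROP INDEX trx_trxtype_idx;

  -- or, instead of dropping anything, add a combined index:
  CREATE INDEX trx_isposted_trxtype_idx
      ON tbas_transactions (isposted, trxtype);
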
I created an index over account and trxtype. Here is the result of explain
(with indexscan=on):

Aggregate  (cost=10874.64..10889.76 rows=302 width=24)
  ->  Group  (cost=10874.64..10882.20 rows=3025 width=24)
        ->  Sort  (cost=10874.64..10874.64 rows=3025 width=24)
              ->  Index Scan using trx_trxtype_idx on tbas_transactions
                    (cost=0.00..10699.78 rows=3025 width=24)

I checked, and the two-column index is there, in the pg_indexes table.

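The check itself is something along these lines, assuming the usual
pg_indexes system view; the indexdef column shows which columns each index
covers:

  SELECT indexname, indexdef
    FROM pg_indexes
   WHERE tablename = 'tbas_transactions';
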
I ran select from pg_statistic... as you advised.
The result is attached.

Col#  Name
  5   account
 10   trxtype
 15   amount
 28   isposted

I must admit I cannot make very much sense out of it. What does it tell?
Thank you very much.

P.S. I am running PG7.1.3. Is explain analyze an improvement in 7.2?

[Attachment: pg_statistic output for tbas_transactions]

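For reference, a column-number-to-name mapping like the one above can be
pulled straight out of the catalogs with something like:

  SELECT attnum, attname
    FROM pg_attribute
   WHERE attrelid = (SELECT oid FROM pg_class
                      WHERE relname = 'tbas_transactions')
     AND attnum > 0
   ORDER BY attnum;
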
Only one of the 350k trxtype=MP records has isposted=false !!!

I am downloading 7.2.2 right now, and, God willing, on Monday I will be able
to apply your recommendations regarding explain analyze.

Which leads me to the following question: I dump this database with pg_dump
in less than 15 minutes and restore it with psql -d mydata -f mydata.dmp in
375 minutes. Is there a faster way?

Thanks again.

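One era-appropriate trick that often helps a big restore, sketched here with
guessed numbers: much of the time normally goes into the CREATE INDEX steps
at the end of the dump, so give the restoring session more sort memory (and,
if you are prepared to switch it back on afterwards, consider turning fsync
off in postgresql.conf just for the duration of the load):

  -- run the dump file from an interactive psql session so a
  -- session-level setting can be applied first:
  SET sort_mem = 65536;   -- in KB; more room for the index-build sorts
  \i mydata.dmp
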
On Fri, 6 Sep 2002, Mihai Gheorghiu wrote:

> I ran select from pg_statistic... as you advised
> The result is attached.
> Col#  Name
>   5   account
>  10   trxtype
>  15   amount
>  28   isposted
> I must admit I cannot make very much sense out of it. What does it tell?
> Thank you very much.
> P.S. I am running PG7.1.3. Is explain analyze an improvement in 7.2?
>
> >> explain select account, sum(amount) from tbas_transactions where isposted
> >> and trxtype = 'MP' group by account;
> >> psql:xx.txt:1:  NOTICE:  QUERY PLAN:
> >>
> >> Sorry, I do not have an explain from before vacuum analyze.
> >> The table has ~700k rows and indices on account, trxtype and a few other
> >> fields used in other queries.

First, I have been assuming you're working on a non-essential and/or
non-production database, where doing such things as deleting indexes is an
acceptable cost to determine and attempt to fix the speed problems you are
experiencing. I wouldn't have suggested such things otherwise.

Second, I believe you also stated that the number of rows in this table with
trxtype = 'MP' is about 350k, i.e. 50%. The pg_statistic output you showed
gives 'RG' as the most common value, with low and high values of, it thinks,
'AS' and 'XP'. Based on this I think the statistics stand a chance of being
incorrect, and you should try and confirm the distribution of values in this
column. Doing

  SELECT trxtype, count(1)
    FROM tbas_transactions
   GROUP BY trxtype
   ORDER BY count(1) DESC
   LIMIT 10;

would be instructive. The limit number is somewhat arbitrary; the most
interesting results of that query will be the first and probably second row
returned and the row where trxtype is 'MP'.

However, as it stands I suggest you should do a

  VACUUM VERBOSE ANALYZE tbas_transactions;

and retest your slow query. If there is no significant improvement, it would
be a good idea to show us the output of that vacuum command and the same
pg_statistic entries as before, but taken after this vacuum.

As for the data loading into 7.2.2 taking a long time, I can't really
suggest anything. You may find the 7.2.2 load takes less time than the 7.1.3
one you tested it on. I think everyone would also recommend doing the
upgrade even with this load time.

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants

"Mihai Gheorghiu" <tanethq@earthlink.net> writes: > Only one of the 350k trxtype=MP records has isposted=false !!! > I am downloading 7.2.2 right now, and, God willing, on Monday I will be able > to apply your recommendations regarding explain analyze. > Which leads me to the following question: > I dump this database with pgdump in less than 15 minutes and restore it with > psql -d mydata -f mydata.dmp in 375 minutes. > Is there a faster way? > Thanks again. Uh... You might want to skip ahead and go right to the new 7.3beta1. Otherwise you are probably going to *want* to do another dump reload cycle in the very near future. Just a thought. Jason
I've run into similar oddnesses twice in the past two months. In both cases,
a developer came to me and said that a particular query ran very quickly the
day the table was created and populated, and ran very slowly starting the
next day, after the DB had been VACUUM'ed.

In both cases, their WHERE clause used operations such as concatenations on
fields that did not have indexes. In both cases, creating the indexes sped
up the query, dropping the cost from some obscene number to a very low one
(on the order of 8000 down to 60).

I'm not sure why the queries worked well before the database was VACUUM'ed,
but it's happened to me twice now - and it sounds similar enough to your
situation that you may want to look at whether you have appropriate indexes
created on your table.

steve

This is the result of the statistic/count query:

 trxtype | count
---------+--------
 MP      | 347529
 AS      |  92273
 PR      |  56664
 TS      |  37756
 RG      |  30438
 PK      |  24764
 UP      |  14930
 EX      |  10285
 PD      |   7817
 OT      |   4149
 WW      |   2948
 PO      |   2568
 VO      |    728
 XP      |      2
 LF      |      1
(15 rows)

On Mon, 9 Sep 2002, Mihai Gheorghiu wrote:

> This is the result of the statistic/count query:
>  trxtype | count
> ---------+--------
>  MP      | 347529
> ...
>  RG      |  30438
> ...
>
> > On Fri, 6 Sep 2002, Mihai Gheorghiu wrote:
> >
> >> I ran select from pg_statistic... as you advised
> >> The result is attached.
> >> Col#  Name
> >>   5   account
> >>  10   trxtype
> >>  15   amount
> >>  28   isposted

From the attached pg_statistic output:

 starelid | staattnum | staop | stanullfrac | stacommonfrac | stacommonval | staloval | stahival
----------+-----------+-------+-------------+---------------+--------------+----------+----------
 55256329 |        10 |  1058 |           0 |     0.0479733 | RG           | AS       | XP

RG fraction of table = 30 / 700 = 5%   (as pg_statistic shows)
MP fraction of table = 350 / 700 = 50% (expected in pg_statistic)

Maybe someone else can verify that 7.1.3 had problems with stats gathering,
but all this just suggests to me that the table hasn't been analysed. Are
you sure it was done? Try what I suggest below, to do just the one table,
when you have time and if it's still appropriate.

Also you could run the command:

  UPDATE pg_statistic
     SET stacommonval = 'MP', stacommonfrac = 0.5
   WHERE starelid = 55256329 AND staattnum = 10;

which will make pg_statistic reflect reality more accurately. However, that
is just going to make the planner choose the seqscan plan over the index
scan, and doesn't explain why the statistics weren't updated properly by
your vacuum analyze.

Are you also sure that this same query on the same data ran faster before
the vacuum? Perhaps the fast query used a different value in the trxtype
test? It boils down to: something must have changed, and if it's not the
data it must be the query.

[as a reminder...]

> >> >> explain select account, sum(amount) from tbas_transactions where isposted
> >> >> and trxtype = 'MP' group by account;
> >> >>
> >> >> Sorry, I do not have an explain from before vacuum analyze.
> >> >> The table has ~700k rows and indices on account, trxtype and a few other
> >> >> fields used in other queries.

And an interesting item to perform...

> > However, as it stands I suggest you should do a
> >
> >   VACUUM VERBOSE ANALYZE tbas_transactions
> >
> > and retest your slow query. If no significant improvement it would be a
> > good idea to show us the output of that vacuum command and the same
> > pg_statistic entries as before but taken after this vacuum.

I hope the 7.2.2 install is going well. It will be interesting to see what
you make of it regarding this issue. I can't see it taking a significantly
shorter time for this query on this data.

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants

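Once that vacuum has been run, the trxtype entry can be re-checked with
something like the following (written against the 7.1.3 pg_statistic layout
shown above; 7.2 reorganises pg_statistic, so the column names differ there):

  VACUUM VERBOSE ANALYZE tbas_transactions;

  SELECT staattnum, stacommonval, stacommonfrac
    FROM pg_statistic
   WHERE starelid = (SELECT oid FROM pg_class
                      WHERE relname = 'tbas_transactions')
     AND staattnum = 10;   -- attribute 10 is trxtype, per the earlier column list
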