Re: Strange (?) Index behavior? - Mailing list pgsql-performance
From | Allen Landsidel |
---|---|
Subject | Re: Strange (?) Index behavior? |
Date | |
Msg-id | 88f1825a04111113125a0fc443@mail.gmail.com Whole thread Raw |
In response to | Re: Strange (?) Index behavior? (Greg Stark <gsstark@mit.edu>) |
Responses |
Re: Strange (?) Index behavior?
Re: Strange (?) Index behavior? |
List | pgsql-performance |
On 11 Nov 2004 15:49:46 -0500, Greg Stark <gsstark@mit.edu> wrote: > Allen Landsidel <alandsidel@gmail.com> writes: > > > > > QUERY PLAN > > ----------------------------------------------------------------------------------------------------------------------------------------------- > > Index Scan using sname_unique on "testtable" (cost=0.00..34453.74 > > rows=8620 width=20) (actual time=77.004..537065.079 rows=74612 > > loops=1) > > Index Cond: ((sname >= 'AA'::text) AND (sname < 'AB'::text)) > > Filter: (sname ~~ 'AA%'::text) > > Total runtime: 537477.737 ms > > (4 rows) > > > > Time: 537480.571 ms > > Nothing you're going to do to the query is going to come up with a more > effective plan than this. It's using the index after all. It's never going to > be lightning fast because it has to process 75k rows. > > However 75k rows shouldn't be taking nearly 10 minutes. It should be taking > about 10 seconds. That's my feeling as well, I thought the index was to blame because it will be quite large, possibly large enough to not fit in memory nor be quickly bursted up. > The 77ms before finding the first record is a bit suspicious. Have you > vacuumed this table regularly? Try a VACUUM FULL VERBOSE, and send the > results. You might try to REINDEX it as well, though I doubt that would help. This table is *brand spanking new* for lack of a better term. I have the data for it in a CSV. I load the CSV up which takes a bit, then create the indexes, do a vacuum analyze verbose, and then posted the results above. I don't think running vacuum a more times is going to change things, at least not without tweaking config settings that affect vacuum. Not a single row has been inserted or altered since the initial load.. it's just a test. I can't give vacuum stats right now because the thing is reloading (again) with different newfs settings -- something I figure I have the time to fiddle with now, and seldom do at other times. These numbers though don't change much between 8K on up to 64K 'cluster' sizes. I'm trying it now with 8K page sizes, with 8K "minimum fragment" sizes. Should speed things up a tiny bit but not enough to really affect this query. Do you still see a need to have the output from the vacuum? > Actually you might consider clustering the table on sname_unique. That would > accomplish the same thing as the VACUUM FULL command and also speed up the > index scan. And the optimizer knows (if you analyze afterwards) it so it > should be more likely to pick the index scan. But currently you have to rerun > cluster periodically. Clustering is really unworkable in this situation. It would work now, in this limited test case, but using it if this were to go into production is unrealistic. It would have to happen fairly often since this table is updated frequently, which will break the clustering quickly with MVCC. Running it often.. well.. it has 70M+ rows, and the entire table is copied, reordered, and rewritten.. so that's a lot of 'scratch space' needed. Finally, clustering locks the table.. Something I'd already considered but quickly ruled out because of these reasons.. More ideas are welcome though. ;) -Allen
pgsql-performance by date: