Re: Strange (?) Index behavior? - Mailing list pgsql-performance
From | Joshua D. Drake |
---|---|
Subject | Re: Strange (?) Index behavior? |
Date | |
Msg-id | 4193DE2B.5060407@commandprompt.com Whole thread Raw |
In response to | Re: Strange (?) Index behavior? (Allen Landsidel <alandsidel@gmail.com>) |
List | pgsql-performance |
>>>----------------------------------------------------------------------------------------------------------------------------------------------- >>> Index Scan using sname_unique on "testtable" (cost=0.00..34453.74 >>>rows=8620 width=20) (actual time=77.004..537065.079 rows=74612 >>>loops=1) >>> Index Cond: ((sname >= 'AA'::text) AND (sname < 'AB'::text)) >>> Filter: (sname ~~ 'AA%'::text) >>> Total runtime: 537477.737 ms >>>(4 rows) >>> >>>Time: 537480.571 ms >> >>Nothing you're going to do to the query is going to come up with a more >>effective plan than this. It's using the index after all. It's never going to >>be lightning fast because it has to process 75k rows. >> >>However 75k rows shouldn't be taking nearly 10 minutes. It should be taking >>about 10 seconds. I am confused about this statement. I have a table with 1.77 million rows that I use gist indexes on (TSearch) and I can pull out of it in less than 2 seconds. Are you saying it should be taking 10 seconds because of the type of plan? 10 seconds seems like an awfullong time for this. Sincerely, Joshua D. Drake > > > That's my feeling as well, I thought the index was to blame because it > will be quite large, possibly large enough to not fit in memory nor be > quickly bursted up. > > >>The 77ms before finding the first record is a bit suspicious. Have you >>vacuumed this table regularly? Try a VACUUM FULL VERBOSE, and send the >>results. You might try to REINDEX it as well, though I doubt that would help. > > > This table is *brand spanking new* for lack of a better term. I have > the data for it in a CSV. I load the CSV up which takes a bit, then > create the indexes, do a vacuum analyze verbose, and then posted the > results above. I don't think running vacuum a more times is going to > change things, at least not without tweaking config settings that > affect vacuum. Not a single row has been inserted or altered since the > initial load.. it's just a test. > > I can't give vacuum stats right now because the thing is reloading > (again) with different newfs settings -- something I figure I have the > time to fiddle with now, and seldom do at other times. These numbers > though don't change much between 8K on up to 64K 'cluster' sizes. I'm > trying it now with 8K page sizes, with 8K "minimum fragment" sizes. > Should speed things up a tiny bit but not enough to really affect this > query. > > Do you still see a need to have the output from the vacuum? > > >>Actually you might consider clustering the table on sname_unique. That would >>accomplish the same thing as the VACUUM FULL command and also speed up the >>index scan. And the optimizer knows (if you analyze afterwards) it so it >>should be more likely to pick the index scan. But currently you have to rerun >>cluster periodically. > > > Clustering is really unworkable in this situation. It would work now, > in this limited test case, but using it if this were to go into > production is unrealistic. It would have to happen fairly often since > this table is updated frequently, which will break the clustering > quickly with MVCC. > > Running it often.. well.. it has 70M+ rows, and the entire table is > copied, reordered, and rewritten.. so that's a lot of 'scratch space' > needed. Finally, clustering locks the table.. > > Something I'd already considered but quickly ruled out because of > these reasons.. > > More ideas are welcome though. ;) > > -Allen > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org -- Command Prompt, Inc., home of PostgreSQL Replication, and plPHP. Postgresql support, programming shared hosting and dedicated hosting. +1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
Attachment
pgsql-performance by date: