Re: PostgreSQL vs. MySQL - Mailing list pgsql-performance
From: Stephan Szabo
Subject: Re: PostgreSQL vs. MySQL
Msg-id: 20030704094307.B21273-100000@megazone23.bigpanda.com
In response to: Re: PostgreSQL vs. MySQL ("Brian Tarbox" <btarbox@theworld.com>)
Responses: Re: PostgreSQL vs. MySQL
List: pgsql-performance
On Fri, 4 Jul 2003, Brian Tarbox wrote:

> > I don't think Brian has any interest in being helped.
> > I suspect he'd made up his mind already.
>
> With all due respect Tom, I don't think I'm the one demonstrating a closed
> mind. Rather than trying to figure out what's going on in my head, how about
> figuring out what's going on in my database? :-)

Well, in the case of getting a sequential scan on something like
  select * from foo where col=10;
where col is a primary key, the first thing I can think of to check is whether
  select * from foo where col='10';
gives a different plan.

In general, for cases where you can't post queries or schema we're kind of stuck and not really able to give intelligent advice, since the answer is often schema/query specific. So the general questions/comments are things like the following (which you've probably heard, but I think they should get put into this thread if only to move the thread towards usefulness).

What are the relative costs/plans if you set enable_seqscan to false before EXPLAIN ANALYZEing the query? If there are places where you think it should be able to do an index scan and it still doesn't, make sure there aren't cross-datatype issues (especially with int constants). Also, using EXPLAIN ANALYZE, where is the time actually being taken? It's often not where the cost estimates would lead you to expect.

Do the row estimates match reality in the EXPLAIN ANALYZE output? If not, does analyzing help? If it still doesn't, does raising the statistics target (to, say, 50, 100, or 1000) with ALTER TABLE and then analyzing help?

Does vacuuming help? What about VACUUM FULL? If the latter helps and the former doesn't, you may need to look at raising the fsm settings.

If shared_buffers is less than 1000, does setting it to something between 1000 and 8000 raise performance? How much memory does the machine have available for caching? If it's a lot, try raising effective_cache_size to see if that helps the choice of plan by giving the planner a more reasonable guess as to cache hit rates.

Are there any sorts in the query? If so, how large would you expect the result set being sorted to be, and can you afford to make sort_mem cover it (either permanently by changing the conf file, or before the query with a SET command)? Is it possible to avoid some sorts in the plan with a multi-column index?

For 7.3 and earlier, does the query use IN or =ANY? If so, it might help to convert it to an EXISTS form.

Does the query use any min/max aggregates? If so, it might help to look for a workaround; this is one case that is truly slow.

PostgreSQL really does require more than minimal optimization at the start: effective_cache_size, shared_buffers, sort_mem and the fsm settings really need to be set at a level appropriate for the machine and queries you have. Without the queries we can't be too specific. The big speed losses I can think of are the datatype-mismatch confusion, followed quickly by row estimates that don't match reality (generally requiring a greater statistics target on the column) and issues with correlation (I'm not really sure there's a good solution for that currently, maybe someone will know -- I've not really run into it on anything I've looked at).
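To make the enable_seqscan and cross-datatype checks above concrete, here is a rough sketch (the table foo and column col are hypothetical, and the actual plans will depend on your data):

  -- compare the planner's default choice with a forced index scan
  EXPLAIN ANALYZE SELECT * FROM foo WHERE col = 10;
  SET enable_seqscan TO off;
  EXPLAIN ANALYZE SELECT * FROM foo WHERE col = 10;
  RESET enable_seqscan;

  -- on 7.3 and earlier, quoting the constant can work around
  -- cross-datatype issues that keep an index from being used
  EXPLAIN ANALYZE SELECT * FROM foo WHERE col = '10';

If the forced index scan comes out faster in actual time than the default plan, the statistics and cost-related settings are the next things to look at.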
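If the row estimates are off, raising the statistics target looks roughly like this (the column name and the target value 100 are placeholders):

  -- keep more detailed statistics for col, then re-gather them
  ALTER TABLE foo ALTER COLUMN col SET STATISTICS 100;
  ANALYZE foo;

  -- then check whether the estimated row counts now match the actual ones
  EXPLAIN ANALYZE SELECT * FROM foo WHERE col < 500;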
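For the vacuum question, a sketch of the comparison (the fsm values below are placeholders, not recommendations; they need to be sized to the amount of update/delete churn you actually have):

  VACUUM ANALYZE foo;   -- plain vacuum: reclaims space, limited by the fsm settings
  VACUUM FULL foo;      -- full vacuum: compacts the table, but takes an exclusive lock

  # in postgresql.conf, if only VACUUM FULL helps:
  max_fsm_pages = 100000
  max_fsm_relations = 1000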
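A sketch of the postgresql.conf side of this for 7.3-era settings (the numbers are only there to show the units, not tuned values for any particular machine; shared_buffers needs a server restart to take effect):

  # shared_buffers and effective_cache_size are in 8KB pages, sort_mem is in KB
  shared_buffers = 4000            # roughly 32MB of shared buffer cache
  effective_cache_size = 65536     # roughly 512MB, about what the OS is caching
  sort_mem = 8192                  # 8MB per sort step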
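For the sort questions, two sketches: raising sort_mem for just one session, and a multi-column index that can let the planner skip a sort entirely (the index and column names are made up):

  -- give this session a bigger sort area (the value is in KB on 7.3), then put it back
  SET sort_mem = 32768;
  EXPLAIN ANALYZE SELECT * FROM foo ORDER BY a, b;
  RESET sort_mem;

  -- an index on the same columns, in the same order, can avoid the sort step
  CREATE INDEX foo_a_b_idx ON foo (a, b);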
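The IN-to-EXISTS rewrite for 7.3 and earlier looks roughly like this (orders and customers are made-up tables):

  -- slow on 7.3 and earlier:
  SELECT * FROM orders
  WHERE customer_id IN (SELECT id FROM customers WHERE region = 'EU');

  -- usually much better on those releases:
  SELECT * FROM orders o
  WHERE EXISTS (SELECT 1 FROM customers c
                WHERE c.id = o.customer_id AND c.region = 'EU');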
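And the usual workaround for the min/max aggregates, assuming an index exists on the column:

  -- SELECT max(col) FROM foo; scans the whole table, but this can use the index:
  SELECT col FROM foo ORDER BY col DESC LIMIT 1;

  -- likewise for min():
  SELECT col FROM foo ORDER BY col ASC LIMIT 1;

  -- if col can be null, add WHERE col IS NOT NULL so the DESC form
  -- doesn't return a null instead of the real maximum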