Thread: Optimizing joins
I have some simple joins that are taking FOREVER to run - on a couple tables with only maybe a hundred thousand rows. I've done vacuum analyze on the tables but they still seem to take a lot of time. Where should I go first for optimizing my postgres server to handle this better? Are there memory options to increase a buffer in postgres, or are there specific ways to optimize the query itself?
Thanks,
Eric
On Mon, 6 May 2002, Eric Peters wrote:
> I have some simple joins that are taking FOREVER to run - on a couple
> tables with only maybe a hundred thousand rows. I've done vacuum
> analyze on the tables but they still seem to take a lot of time. Where
> should I go first for optimizing my postgres server to handle this
> better? Are there memory options to increase a buffer in postgres, or
> are there specific ways to optimize the query itself?

Two things:

1: Use EXPLAIN to see what's taking so long.
2: Create indexes on the fields being used to join the tables.

If both of those are done, and the indexes are being used, then you can
increase memory buffers, but your kernel has to know how to hand out more
shared memory to PostgreSQL or PostgreSQL will fail to start.

If the indexes are NOT being used after an analyze, then you can alter the
numbers the PostgreSQL server uses when deciding on the "cost" of different
options. The options are:

    random_page_cost
    cpu_tuple_cost
    cpu_index_tuple_cost
    cpu_operator_cost

Use SHOW <setting> to see what it's set to, and SET <setting> = number to
change it. Turning down cpu_index_tuple_cost and random_page_cost favors
indexes; turning down cpu_operator_cost favors sequential scans.

If EXPLAIN shows your planner using a hash join, try SET enable_hashjoin = off
and see if it runs faster. Near as I can tell, hash joins are just plain slow
on PostgreSQL compared to almost any other option.
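For example, assuming the join is on a customer_id column shared by two
tables called orders and customers (placeholder names - substitute your own
tables, columns, and index names), the whole sequence might look roughly
like this in psql:

    -- 1: See what plan the planner is choosing for the join
    EXPLAIN
    SELECT *
    FROM orders o, customers c
    WHERE o.customer_id = c.customer_id;

    -- 2: Index the join columns on both sides, then re-analyze so the
    -- planner has fresh statistics
    CREATE INDEX orders_customer_id_idx ON orders (customer_id);
    CREATE INDEX customers_customer_id_idx ON customers (customer_id);
    VACUUM ANALYZE orders;
    VACUUM ANALYZE customers;

    -- Check a planner cost setting and nudge it to favor index scans
    -- (the value 2 here is just an example, not a recommendation)
    SHOW random_page_cost;
    SET random_page_cost = 2;

    -- Try disabling hash joins for this session and compare the plan
    SET enable_hashjoin = off;
    EXPLAIN
    SELECT *
    FROM orders o, customers c
    WHERE o.customer_id = c.customer_id;

The buffer setting mentioned above is shared_buffers in postgresql.conf; it
only takes effect after a restart, and if you raise it past what the kernel
allows you'll need to raise the kernel's shared memory limit (SHMMAX) as well.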