PG 9.x prefers slower Hash Joins? - Mailing list pgsql-performance

From Anthony Presley
Subject PG 9.x prefers slower Hash Joins?
Date
Msg-id CAO2AxyrKjemyjGjnBUWnkwAANj09cgjqGcu66gJ70u7Z0RkX_A@mail.gmail.com
Whole thread Raw
Responses Re: PG 9.x prefers slower Hash Joins?
List pgsql-performance
In relation to my previous thread (about SW RAID vs. HW RAID on a P400), I was able to narrow down the filesystem speed and in general, our new system (running PG 9.1) is about 3x - 5x faster on the IO.

In looking at the query plans in more depth, it appears that PG 9.0 and 9.1 are both preferring to do hash joins, which seem to have a "linear" time and are slower than PG 8.4 doing an index scan.

For example, on PG 9.x:
  http://explain.depesz.com/s/qji - This takes 307ms, all the time.  Doesn't matter if it's "cached", or fresh from a reboot.

Same query on PG 8.4:
  http://explain.depesz.com/s/8Pd - This can take 2-3s the first time, but then takes 42ms once it's cached.

Both of these servers have the same indexes, similar postgresql.conf, and almost identical data.  However, the old server is doing some different planning than the new server.  I've run analyze on both of these databases.  Some relevant PG parameters:

  max_connections = 150
  shared_buffers = 6400MB (have tried as high as 20GB)
  work_mem = 20MB (have tried as high as 100MB)
  effective_io_concurrency = 6
  fsync = on
  synchronous_commit = off
  wal_buffers = 16MB
  checkpoint_segments = 30  (have tried 200 when I was loading the db)
  random_page_cost = 2.5
  effective_cache_size = 10240MB  (have tried as high as 16GB)

If I disable the hashjoin, I get massive improvements on PG 9.x ... as fast (or faster) than our PG 8.4 instance.


--
Anthony Presley

pgsql-performance by date:

Previous
From: Arjen van der Meijden
Date:
Subject: Re: RAID Controller (HP P400) beat by SW-RAID?
Next
From: Marti Raudsepp
Date:
Subject: Re: Postgres for a "data warehouse", 5-10 TB