Thread: Performance issue - 2 linux machines, identical configs, different performance
Performance issue - 2 linux machines, identical configs, different performance
Hi, sorry about the blank post yesterday – let’s try again
We have two machines. Both running Linux Redhat, both running postgres 8.2.5.
Both have nearly identical 125 GB databases. In fact we use PITR Recovery to
Replicate from one to the other. The machine we replicate to runs a query with
About 10 inner and left joins about 5 times slower than the original machine
I run an explain on both. Machine1 (original) planner favors hash joins about 3 to 1
Over nested loop joins. Machine2 (replicated) uses only nested loop joins – no hash at all.
A few details – I can always provide more
MACHINE1 – original:
TOTAL RAW MEMORY – 30 GB
TOTAL SHARED MEMORY (shmmax value) – 4 GB
Database configs
SHARED_BUFFERS ------------– 1525 MB
MAX_PREPARED_TRANSACTIONS – 5
WORK_MEM – -------------------- 300 MB
MAINTENANCE_WORK_MEM - 512 MB
MAX_FSM_PAGES -------------- 3,000,000
CHECKPOINT_SEGMENTS ----- 64
WAL_BUFFERS --------------------- 768
EFFECTIVE_CACHE_SIZE ---- 2 GB
Planner method configs all turned on by default, including enable_hashjoin
MACHINE2 – we run 2 postgres instances. Port 5433 runs continuous PITR recoveries
Port 5432 receives the ‘latest and greatest’ database when port 5433 finishes a recovery
TOTAL RAW MEMORY – 16 GB (this is a VMWARE setup on a netapp)
TOTAL SHARED MEMORY (shmmax value) – 4 GB
Database configs – port 5432 instance
SHARED_BUFFERS -----------– 1500 MB
MAX_PREPARED_TRANSACTIONS – 1 (we don’t run prepared transactions here)
WORK_MEM – -------------------- 300 MB
MAINTENANCE_WORK_MEM - 100 MB (don’t think this comes into play in this conversation)
MAX_FSM_PAGES -------------- 1,000,000
CHECKPOINT_SEGMENTS ----- 32
WAL_BUFFERS --------------------- 768
EFFECTIVE_CACHE_SIZE ---- 2 GB
Planner method configs all turned on by default, including enable_hashjoin
Database configs – port 5433 instance
SHARED_BUFFERS -----------– 1500 MB
MAX_PREPARED_TRANSACTIONS – 1 (we don’t run prepared transactions here)
WORK_MEM – -------------------- 250 MB
MAINTENANCE_WORK_MEM - 100 MB (don’t think this comes into play in this conversation)
MAX_FSM_PAGES -------------- 1,000,000
CHECKPOINT_SEGMENTS ----- 32
WAL_BUFFERS --------------------- 768
EFFECTIVE_CACHE_SIZE ---- 2 GB
Planner method configs all turned on by default, including enable_hashjoin
Now some size details about the 11 tables involved in the join
All join fields are indexed unless otherwise noted and are of type integer unless otherwise noted
TABLE1 -------------398 pages
TABLE2 -------- 5,014 pages INNER JOIN on TABLE1
TABLE3 ------- 34,729 pages INNER JOIN on TABLE2
TABLE4 ----1,828,000 pages INNER JOIN on TABLE2
TABLE5 ----1,838,000 pages INNER JOIN on TABLE4
TABLE6 ------ 122,500 pages INNER JOIN on TABLE4
TABLE7 ----------- 621 pages INNER JOIN on TABLE6
TABLE8 ---------- 4 pages INNER JOIN on TABLE7 (TABLE7 column not indexed)
TABLE9 ----------- 2 pages INNER JOIN on TABLE8 (TABLE8 column not indexed)
TABLE10 --------- 13 pages LEFT JOIN on TABLE6 (columns on both tables text, neither column indexed)
TABLE11 -1,976,430 pages LEFT JOIN on TABLE5. AND explicit join on TABLE6
The WHERE clause filters out primary key values from TABLE1 to 1 value and a 1 month range of
Indexed dates from TABLE4.
So, my guess is the disparity of performance (40 seconds vs 180 seconds) has to do with MACHINE2 not
Availing itself of hash joins which by my understanding is much faster.
Any help / insight appreciated. Thank you
Mark Steben│Database Administrator│
@utoRevenue® "Join the Revenue-tion"
95 Ashley Ave. West Springfield, MA., 01089
413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax)
@utoRevenue is a registered trademark and a division of Dominion Enterprises
Re: Performance issue - 2 linux machines, identical configs, different performance
2009/6/17 Mark Steben <msteben@autorevenue.com>: > A few details – I can always provide more Could you send: 1. Exact text of query. 2. EXPLAIN ANALYZE output on each machine. 3. VACUUM VERBOSE output on each machine, or at least the last 10 lines. ...Robert
Re: Performance issue - 2 linux machines, identical configs, different performance
>We have two machines. Both running Linux Redhat, both running postgres 8.2.5. >Both have nearly identical 125 GB databases. In fact we use PITR Recovery to >Replicate from one to the other. I have to ask the obvious question. Do you regularly analyze the machine you replicate too? Dave
Re: Performance issue - 2 linux machines, identical configs, different performance
Yes I analyze after each replication. Mark Steben│Database Administrator│ @utoRevenue-R- "Join the Revenue-tion" 95 Ashley Ave. West Springfield, MA., 01089 413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax) @utoRevenue is a registered trademark and a division of Dominion Enterprises -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Dave Dutcher Sent: Wednesday, June 17, 2009 1:39 PM To: 'Mark Steben'; pgsql-performance@postgresql.org Cc: 'Rich Garabedian' Subject: Re: [PERFORM] Performance issue - 2 linux machines, identical configs, different performance >We have two machines. Both running Linux Redhat, both running postgres 8.2.5. >Both have nearly identical 125 GB databases. In fact we use PITR Recovery to >Replicate from one to the other. I have to ask the obvious question. Do you regularly analyze the machine you replicate too? Dave -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance