Re: postgres performance: comparing 2 data centers - Mailing list pgsql-performance
From | Michael Nonemacher |
---|---|
Subject | Re: postgres performance: comparing 2 data centers |
Date | |
Msg-id | E3A41572DB871B42AB6939873D95E8CA03874D@auscorpex-1.austin.messageone.com Whole thread Raw |
In response to | postgres performance: comparing 2 data centers ("Michael Nonemacher" <Michael_Nonemacher@messageone.com>) |
Responses |
Re: postgres performance: comparing 2 data centers
Re: postgres performance: comparing 2 data centers Re: postgres performance: comparing 2 data centers |
List | pgsql-performance |
Slight update: Thanks for the replies; this is starting to make a little more sense... I've managed to track down the root of the problem to a single query on a single table. I have a query that looks like this: select count(*) from members where group_id = ? and member_id > 0; The members table contains about 500k rows. It has an index on (group_id, member_id) and on (member_id, group_id). It seems like the statistics are wildly different depending on whether the last operation on the table was a 'vacuum analyze' or an 'analyze'. Vacuum or vacuum-analyze puts the correct number (~500k) in pg_class.reltuples, but analyze puts 7000 in pg_class.reltuples. The reltuples numbers associated with this table's indexes are unchanged. After a vacuum-analyze, the query correctly uses the index on (group_id, member_id), and runs very fast (sub-millisecond reported by explain analyze). After an analyze, the query uses the (member_id, group_id) index, and the query takes much longer (150ms reported by explain analyze). (Yes, I said the 2 databases were using the same query plan; it turns out they're only sometimes using the same query plan. :( ) A similar problem happens to some of my other tables (according to pg_class.reltuples), although I haven't seen query performance change as much. Any idea what could cause this bad analyze behavior? Any guesses why this has happened in one of my data centers but not both? (Coincidence isn't a big stretch here.) What can I do to stop or change this behavior? Apologies if this is a known problem... mike -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Michael Nonemacher Sent: Friday, June 04, 2004 10:43 AM To: pgsql-performance@postgresql.org Subject: [PERFORM] postgres performance: comparing 2 data centers I have two instances of a production application that uses Postgres 7.2, deployed in two different data centers for about the last 6 months. The sizes, schemas, configurations, hardware, and access patterns of the two databases are nearly identical, but one consistently takes at least 5x longer than the other for some common operations. During this time, CPU usage and IO on the slow database are both high (sustained); I'm not sure about the fast database. These common operations are chatty - at least tens of thousands of queries over a 5 to 60 minute stretch - but the queries themselves are fairly simple. The query plans are identical across both databases, and the data distribution is comparable. The tables involved in these common operations change frequently, and are indexed according to these queries. The queries use the indexes as expected. The tables involved have 50k-500k rows. We 'vacuum analyze' nightly, and we recently rebuilt the indexes on the slow database (using reindex table). This cut the number of index pages dramatically: from ~1800 to ~50, but didn't noticeably change the time or CPU utilization for the common operations described above. When running pgbench, both databases have very similar results (200-260 over multiple runs with 5 concurrent threads). I know of a few things I can do to make this operation marginally simpler, but I'm most interested in the difference between the two databases. I haven't come up with a theory that explains each of these things. What are some things I can look into to track this down further? mike ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
pgsql-performance by date: