Re: Tuning/performance question. - Mailing list pgsql-general
From | David Griffiths |
---|---|
Subject | Re: Tuning/performance question. |
Date | |
Msg-id | 007f01c385e2$2264f290$6501a8c0@griffiths2 Whole thread Raw |
In response to | Re: populate table with large csv file ("Dave [Hawk-Systems]" <dave@hawk-systems.com>) |
List | pgsql-general |
> I guess that this question has been discussed very often - but I cannot > remember why exactly. Is there a pointer to a technical explanation? Has > it something to do with MVCC? But ist it one of MVCC's benefits that we > can make a consistent online backup without archiving redo locks (Oracle > can't, DB2 can). Is DB2 slower than Oracle in such cases (count(*)) as > well? > > Workaround: > We can sometimes fake a bit to avoid such costly queries and set up a > trigger that calls a function that increases a counter in a separate > counter table. Then we are lightning-fast. > > But many users compain about PostgreSQL's poor count(*) performance, > that's true and can be critical when someone wants to replace another > database product by PostgreSQL. This is but one of many tests we're doing. The count(*) performance is not the deciding factor. This query was pulled from our production system, and I've extracted the exact tables and data from the production system to test. MySQL with MyISAM does in fact cheat on the count(*). InnoDB does not, however. The "explain" indicates that it's doing the work, and analyzing the tables dropped the cost of the query from .35 seconds to .20 seconds. Here's the same query, but selecting data (to test the databases ability to find a single row quicky): SELECT current_timestamp; SELECT company_name, address_1, address_2, address_3, city, address_list.state_province_id, state_province_short_desc, country_desc, zip_code, address_list.country_id, contact_info.email, commercial_entity.user_account_id, phone_num_1, phone_num_fax, website, boats_website FROM commercial_entity, country, user_account, address_list LEFT JOIN state_province ON address_list.state_province_id = state_province.state_province_id LEFT JOIN contact_info ON address_list.contact_info_id = contact_info.contact_info_id WHERE address_list.address_type_id = 101 AND commercial_entity.commercial_entity_id=225528 AND commercial_entity.commercial_entity_id = address_list.commercial_entity_id AND address_list.country_id = country.country_id AND commercial_entity.user_account_id = user_account.user_account_id AND user_account.user_role_id IN (101, 101); SELECT current_timestamp; Postgres takes about 33 seconds to get the row back. Here's the "EXPLAIN": Nested Loop (cost=0.00..64570.33 rows=1 width=385) -> Nested Loop (cost=0.00..64567.30 rows=1 width=361) -> Nested Loop (cost=0.00..64563.97 rows=1 width=349) Join Filter: ("outer".commercial_entity_id = "inner".commercial_entity_id) -> Index Scan using commercial_entity_pkey on commercial_entity (cost=0.00..5.05 rows=1 width=94) Index Cond: (commercial_entity_id = 225528::numeric) -> Materialize (cost=63343.66..63343.66 rows=97221 width=255) -> Merge Join (cost=0.00..63343.66 rows=97221 width=255) Merge Cond: ("outer".contact_info_id = "inner".contact_info_id) -> Nested Loop (cost=0.00..830457.52 rows=97221 width=222) Join Filter: ("outer".state_province_id = "inner".state_province_id) -> Index Scan using addr_list_ci_id_i on address_list (cost=0.00..586676.65 rows=97221 width=205) Filter: (address_type_id = 101::numeric) -> Seq Scan on state_province (cost=0.00..1.67 rows=67 width=17) -> Index Scan using contact_info_pkey on contact_info (cost=0.00..3366.76 rows=56435 width=33) -> Index Scan using user_account_pkey on user_account (cost=0.00..3.32 rows=1 width=12) Index Cond: ("outer".user_account_id = user_account.user_account_id) Filter: (user_role_id = 101::numeric) -> Index Scan using country_pkey on country (cost=0.00..3.01 rows=1 width=24) Index Cond: ("outer".country_id = country.country_id) (20 rows) David.
pgsql-general by date: