oracle to psql migration - slow query in postgres - Mailing list pgsql-performance

From Tony Capobianco
Subject oracle to psql migration - slow query in postgres
Date
Msg-id 1287085384.1684.52.camel@tony1.localdomain
Whole thread Raw
Responses Re: oracle to psql migration - slow query in postgres  (bricklen <bricklen@gmail.com>)
Re: oracle to psql migration - slow query in postgres  (Ivan Voras <ivoras@freebsd.org>)
Re: oracle to psql migration - slow query in postgres  (Cédric Villemain <cedric.villemain.debian@gmail.com>)
Re: oracle to psql migration - slow query in postgres  ("Pierre C" <lists@peufeu.com>)
Re: oracle to psql migration - slow query in postgres  (Jon Nelson <jnelson+pgsql@jamponi.net>)
Re: oracle to psql migration - slow query in postgres  (Merlin Moncure <mmoncure@gmail.com>)
Re: oracle to psql migration - slow query in postgres  ("Igor Neyman" <ineyman@perceptron.com>)
List pgsql-performance
We are in the process of testing migration of our oracle data warehouse
over to postgres.  A potential showstopper are full table scans on our
members table.  We can't function on postgres effectively unless index
scans are employed.  I'm thinking I don't have something set correctly
in my postgresql.conf file, but I'm not sure what.

This table has approximately 300million rows.

Version:
SELECT version();

version
------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.2 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC)
4.1.2 20071124 (Red Hat 4.1.2-42), 64-bit

We have 4 quad-core processors and 32GB of RAM.  The below query uses
the members_sorted_idx_001 index in oracle, but in postgres, the
optimizer chooses a sequential scan.

explain analyze create table tmp_srcmem_emws1
as
select emailaddress, websiteid
  from members
 where emailok = 1
   and emailbounced = 0;
                                                          QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on members  (cost=0.00..14137154.64 rows=238177981 width=29)
(actual time=0.052..685834.785 rows=236660930 loops=1)
   Filter: ((emailok = 1::numeric) AND (emailbounced = 0::numeric))
 Total runtime: 850306.220 ms
(3 rows)

show shared_buffers ;
 shared_buffers
----------------
 7680MB
(1 row)

show effective_cache_size ;
 effective_cache_size
----------------------
 22GB
(1 row)

show work_mem ;
 work_mem
----------
 768MB
(1 row)

show enable_seqscan ;
 enable_seqscan
----------------
 on
(1 row)

Below are the data definitions for the table/indexes in question:

\d members
                     Table "members"
       Column        |            Type             | Modifiers
---------------------+-----------------------------+-----------
 memberid            | numeric                     | not null
 firstname           | character varying(50)       |
 lastname            | character varying(50)       |
 emailaddress        | character varying(50)       |
 password            | character varying(50)       |
 address1            | character varying(50)       |
 address2            | character varying(50)       |
 city                | character varying(50)       |
 statecode           | character varying(50)       |
 zipcode             | character varying(50)       |
 birthdate           | date                        |
 emailok             | numeric(2,0)                |
 gender              | character varying(1)        |
 addeddate           | timestamp without time zone |
 emailbounced        | numeric(2,0)                |
 changedate          | timestamp without time zone |
 optoutsource        | character varying(100)      |
 websiteid           | numeric                     |
 promotionid         | numeric                     |
 sourceid            | numeric                     |
 siteid              | character varying(64)       |
 srcwebsiteid        | numeric                     |
 homephone           | character varying(20)       |
 homeareacode        | character varying(10)       |
 campaignid          | numeric                     |
 srcmemberid         | numeric                     |
 optoutdate          | date                        |
 regcomplete         | numeric(1,0)                |
 regcompletesourceid | numeric                     |
 ipaddress           | character varying(25)       |
 pageid              | numeric                     |
 streetaddressstatus | numeric(1,0)                |
 middlename          | character varying(50)       |
 optinprechecked     | numeric(1,0)                |
 optinposition       | numeric                     |
 homephonestatus     | numeric                     |
 addeddate_id        | numeric                     |
 changedate_id       | numeric                     |
 rpmindex            | numeric                     |
 optmode             | numeric(1,0)                |
 countryid           | numeric                     |
 confirmoptin        | numeric(2,0)                |
 bouncedate          | date                        |
 memberageid         | numeric                     |
 sourceid2           | numeric                     |
 remoteuserid        | character varying(50)       |
 goal                | numeric(1,0)                |
 flowdepth           | numeric                     |
 pagetype            | numeric                     |
 savepassword        | character varying(50)       |
 customerprofileid   | numeric                     |
Indexes:
    "email_website_unq" UNIQUE, btree (emailaddress, websiteid),
tablespace "members_idx"
    "member_addeddateid_idx" btree (addeddate_id), tablespace
"members_idx"
    "member_changedateid_idx" btree (changedate_id), tablespace
"members_idx"
    "members_fdate_idx" btree (to_char_year_month(addeddate)),
tablespace "esave_idx"
    "members_memberid_idx" btree (memberid), tablespace "members_idx"
    "members_mid_emailok_idx" btree (memberid, emailaddress, zipcode,
firstname, emailok), tablespace "members_idx"
    "members_sorted_idx_001" btree (websiteid, emailok, emailbounced,
addeddate, memberid, zipcode, statecode, emailaddress), tablespace
"members_idx"
    "members_src_idx" btree (websiteid, emailbounced, sourceid),
tablespace "members_idx"
    "members_wid_idx" btree (websiteid), tablespace "members_idx"

select tablename, indexname, tablespace, indexdef from pg_indexes where
tablename = 'members';
-[ RECORD
1
]----------------------------------------------------------------------------------------------------------------------------------------------------
tablename  | members
indexname  | members_fdate_idx
tablespace | esave_idx
indexdef   | CREATE INDEX members_fdate_idx ON members USING btree
(to_char_year_month(addeddate))
-[ RECORD
2
]----------------------------------------------------------------------------------------------------------------------------------------------------
tablename  | members
indexname  | member_changedateid_idx
tablespace | members_idx
indexdef   | CREATE INDEX member_changedateid_idx ON members USING btree
(changedate_id)
-[ RECORD
3
]----------------------------------------------------------------------------------------------------------------------------------------------------
tablename  | members
indexname  | member_addeddateid_idx
tablespace | members_idx
indexdef   | CREATE INDEX member_addeddateid_idx ON members USING btree
(addeddate_id)
-[ RECORD
4
]----------------------------------------------------------------------------------------------------------------------------------------------------
tablename  | members
indexname  | members_wid_idx
tablespace | members_idx
indexdef   | CREATE INDEX members_wid_idx ON members USING btree
(websiteid)
-[ RECORD
5
]----------------------------------------------------------------------------------------------------------------------------------------------------
tablename  | members
indexname  | members_src_idx
tablespace | members_idx
indexdef   | CREATE INDEX members_src_idx ON members USING btree
(websiteid, emailbounced, sourceid)
-[ RECORD
6
]----------------------------------------------------------------------------------------------------------------------------------------------------
tablename  | members
indexname  | members_sorted_idx_001
tablespace | members_idx
indexdef   | CREATE INDEX members_sorted_idx_001 ON members USING btree
(websiteid, emailok, emailbounced, addeddate, memberid, zipcode,
statecode, emailaddress)
-[ RECORD
7
]----------------------------------------------------------------------------------------------------------------------------------------------------
tablename  | members
indexname  | members_mid_emailok_idx
tablespace | members_idx
indexdef   | CREATE INDEX members_mid_emailok_idx ON members USING btree
(memberid, emailaddress, zipcode, firstname, emailok)
-[ RECORD
8
]----------------------------------------------------------------------------------------------------------------------------------------------------
tablename  | members
indexname  | members_memberid_idx
tablespace | members_idx
indexdef   | CREATE INDEX members_memberid_idx ON members USING btree
(memberid)
-[ RECORD
9
]----------------------------------------------------------------------------------------------------------------------------------------------------
tablename  | members
indexname  | email_website_unq
tablespace | members_idx
indexdef   | CREATE UNIQUE INDEX email_website_unq ON members USING
btree (emailaddress, websiteid)


This table has also been vacuumed analyzed as well:

select * from pg_stat_all_tables where relname = 'members';
-[ RECORD 1 ]----+------------------------------
relid            | 3112786
schemaname       | xxxxx
relname          | members
seq_scan         | 298
seq_tup_read     | 42791828896
idx_scan         | 31396925
idx_tup_fetch    | 1083796963
n_tup_ins        | 291308316
n_tup_upd        | 0
n_tup_del        | 4188020
n_tup_hot_upd    | 0
n_live_tup       | 285364632
n_dead_tup       | 109658
last_vacuum      | 2010-10-12 20:26:01.227393-04
last_autovacuum  |
last_analyze     | 2010-10-12 20:28:01.105656-04
last_autoanalyze | 2010-09-16 20:50:00.712418-04



pgsql-performance by date:

Previous
From: Cédric Villemain
Date:
Subject: Re: How does PG know if data is in memory?
Next
From: Robert Haas
Date:
Subject: Re: Slow count(*) again...