Re: oracle to psql migration - slow query in postgres - Mailing list pgsql-performance

From Tony Capobianco
Subject Re: oracle to psql migration - slow query in postgres
Date
Msg-id 1287166429.1730.170.camel@tony1.localdomain
Whole thread Raw
In response to Re: oracle to psql migration - slow query in postgres  ("Igor Neyman" <ineyman@perceptron.com>)
Responses Re: oracle to psql migration - slow query in postgres
List pgsql-performance
Thanks for all your responses. What's interesting is that an index is
used when this query is executed in Oracle.  It appears to do some
parallel processing:

SQL> set line 200
delete from plan_table;
explain plan for
select websiteid, emailaddress
  from members
 where emailok = 1
   and emailbounced = 0;

SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());
SQL>
3 rows deleted.

SQL>   2    3    4    5
Explained.

SQL> SQL>
PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4247959398


-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name                   | Rows  | Bytes
| Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |

-------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                        |   237M|
7248M|   469K  (2)| 01:49:33 |        |      |            |
|   1 |  PX COORDINATOR         |                        |       |
|            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10000               |   237M|
7248M|   469K  (2)| 01:49:33 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR    |                        |   237M|
7248M|   469K  (2)| 01:49:33 |  Q1,00 | PCWC |            |
|*  4 |     INDEX FAST FULL SCAN| MEMBERS_SORTED_IDX_001 |   237M|
7248M|   469K  (2)| 01:49:33 |  Q1,00 | PCWP |            |

-------------------------------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("EMAILBOUNCED"=0 AND "EMAILOK"=1)

16 rows selected.


On Fri, 2010-10-15 at 13:43 -0400, Igor Neyman wrote:
>
> > -----Original Message-----
> > From: Tony Capobianco [mailto:tcapobianco@prospectiv.com]
> > Sent: Thursday, October 14, 2010 3:43 PM
> > To: pgsql-performance@postgresql.org
> > Subject: oracle to psql migration - slow query in postgres
> >
> > We are in the process of testing migration of our oracle data
> > warehouse over to postgres.  A potential showstopper are full
> > table scans on our members table.  We can't function on
> > postgres effectively unless index scans are employed.  I'm
> > thinking I don't have something set correctly in my
> > postgresql.conf file, but I'm not sure what.
> >
> > This table has approximately 300million rows.
> >
> > Version:
> > SELECT version();
> >
> > version
> > --------------------------------------------------------------
> > ----------------------------------------------------
> >  PostgreSQL 8.4.2 on x86_64-redhat-linux-gnu, compiled by GCC
> > gcc (GCC)
> > 4.1.2 20071124 (Red Hat 4.1.2-42), 64-bit
> >
> > We have 4 quad-core processors and 32GB of RAM.  The below
> > query uses the members_sorted_idx_001 index in oracle, but in
> > postgres, the optimizer chooses a sequential scan.
> >
> > explain analyze create table tmp_srcmem_emws1 as select
> > emailaddress, websiteid
> >   from members
> >  where emailok = 1
> >    and emailbounced = 0;
> >                                                           QUERY
> > PLAN
> > --------------------------------------------------------------
> > ----------------------------------------------------------------
> >  Seq Scan on members  (cost=0.00..14137154.64 rows=238177981
> > width=29) (actual time=0.052..685834.785 rows=236660930 loops=1)
> >    Filter: ((emailok = 1::numeric) AND (emailbounced =
> > 0::numeric))  Total runtime: 850306.220 ms
> > (3 rows)
> >
> > show shared_buffers ;
> >  shared_buffers
> > ----------------
> >  7680MB
> > (1 row)
> >
> > show effective_cache_size ;
> >  effective_cache_size
> > ----------------------
> >  22GB
> > (1 row)
> >
> > show work_mem ;
> >  work_mem
> > ----------
> >  768MB
> > (1 row)
> >
> > show enable_seqscan ;
> >  enable_seqscan
> > ----------------
> >  on
> > (1 row)
> >
> > Below are the data definitions for the table/indexes in question:
> >
> > \d members
> >                      Table "members"
> >        Column        |            Type             | Modifiers
> > ---------------------+-----------------------------+-----------
> >  memberid            | numeric                     | not null
> >  firstname           | character varying(50)       |
> >  lastname            | character varying(50)       |
> >  emailaddress        | character varying(50)       |
> >  password            | character varying(50)       |
> >  address1            | character varying(50)       |
> >  address2            | character varying(50)       |
> >  city                | character varying(50)       |
> >  statecode           | character varying(50)       |
> >  zipcode             | character varying(50)       |
> >  birthdate           | date                        |
> >  emailok             | numeric(2,0)                |
> >  gender              | character varying(1)        |
> >  addeddate           | timestamp without time zone |
> >  emailbounced        | numeric(2,0)                |
> >  changedate          | timestamp without time zone |
> >  optoutsource        | character varying(100)      |
> >  websiteid           | numeric                     |
> >  promotionid         | numeric                     |
> >  sourceid            | numeric                     |
> >  siteid              | character varying(64)       |
> >  srcwebsiteid        | numeric                     |
> >  homephone           | character varying(20)       |
> >  homeareacode        | character varying(10)       |
> >  campaignid          | numeric                     |
> >  srcmemberid         | numeric                     |
> >  optoutdate          | date                        |
> >  regcomplete         | numeric(1,0)                |
> >  regcompletesourceid | numeric                     |
> >  ipaddress           | character varying(25)       |
> >  pageid              | numeric                     |
> >  streetaddressstatus | numeric(1,0)                |
> >  middlename          | character varying(50)       |
> >  optinprechecked     | numeric(1,0)                |
> >  optinposition       | numeric                     |
> >  homephonestatus     | numeric                     |
> >  addeddate_id        | numeric                     |
> >  changedate_id       | numeric                     |
> >  rpmindex            | numeric                     |
> >  optmode             | numeric(1,0)                |
> >  countryid           | numeric                     |
> >  confirmoptin        | numeric(2,0)                |
> >  bouncedate          | date                        |
> >  memberageid         | numeric                     |
> >  sourceid2           | numeric                     |
> >  remoteuserid        | character varying(50)       |
> >  goal                | numeric(1,0)                |
> >  flowdepth           | numeric                     |
> >  pagetype            | numeric                     |
> >  savepassword        | character varying(50)       |
> >  customerprofileid   | numeric                     |
> > Indexes:
> >     "email_website_unq" UNIQUE, btree (emailaddress,
> > websiteid), tablespace "members_idx"
> >     "member_addeddateid_idx" btree (addeddate_id), tablespace
> > "members_idx"
> >     "member_changedateid_idx" btree (changedate_id),
> > tablespace "members_idx"
> >     "members_fdate_idx" btree
> > (to_char_year_month(addeddate)), tablespace "esave_idx"
> >     "members_memberid_idx" btree (memberid), tablespace "members_idx"
> >     "members_mid_emailok_idx" btree (memberid, emailaddress,
> > zipcode, firstname, emailok), tablespace "members_idx"
> >     "members_sorted_idx_001" btree (websiteid, emailok,
> > emailbounced, addeddate, memberid, zipcode, statecode,
> > emailaddress), tablespace "members_idx"
> >     "members_src_idx" btree (websiteid, emailbounced,
> > sourceid), tablespace "members_idx"
> >     "members_wid_idx" btree (websiteid), tablespace "members_idx"
> >
> > select tablename, indexname, tablespace, indexdef from
> > pg_indexes where tablename = 'members'; -[ RECORD
> > 1
> > ]-------------------------------------------------------------
> > --------------------------------------------------------------
> > -------------------------
> > tablename  | members
> > indexname  | members_fdate_idx
> > tablespace | esave_idx
> > indexdef   | CREATE INDEX members_fdate_idx ON members USING btree
> > (to_char_year_month(addeddate))
> > -[ RECORD
> > 2
> > ]-------------------------------------------------------------
> > --------------------------------------------------------------
> > -------------------------
> > tablename  | members
> > indexname  | member_changedateid_idx
> > tablespace | members_idx
> > indexdef   | CREATE INDEX member_changedateid_idx ON members
> > USING btree
> > (changedate_id)
> > -[ RECORD
> > 3
> > ]-------------------------------------------------------------
> > --------------------------------------------------------------
> > -------------------------
> > tablename  | members
> > indexname  | member_addeddateid_idx
> > tablespace | members_idx
> > indexdef   | CREATE INDEX member_addeddateid_idx ON members
> > USING btree
> > (addeddate_id)
> > -[ RECORD
> > 4
> > ]-------------------------------------------------------------
> > --------------------------------------------------------------
> > -------------------------
> > tablename  | members
> > indexname  | members_wid_idx
> > tablespace | members_idx
> > indexdef   | CREATE INDEX members_wid_idx ON members USING btree
> > (websiteid)
> > -[ RECORD
> > 5
> > ]-------------------------------------------------------------
> > --------------------------------------------------------------
> > -------------------------
> > tablename  | members
> > indexname  | members_src_idx
> > tablespace | members_idx
> > indexdef   | CREATE INDEX members_src_idx ON members USING btree
> > (websiteid, emailbounced, sourceid)
> > -[ RECORD
> > 6
> > ]-------------------------------------------------------------
> > --------------------------------------------------------------
> > -------------------------
> > tablename  | members
> > indexname  | members_sorted_idx_001
> > tablespace | members_idx
> > indexdef   | CREATE INDEX members_sorted_idx_001 ON members
> > USING btree
> > (websiteid, emailok, emailbounced, addeddate, memberid,
> > zipcode, statecode, emailaddress) -[ RECORD
> > 7
> > ]-------------------------------------------------------------
> > --------------------------------------------------------------
> > -------------------------
> > tablename  | members
> > indexname  | members_mid_emailok_idx
> > tablespace | members_idx
> > indexdef   | CREATE INDEX members_mid_emailok_idx ON members
> > USING btree
> > (memberid, emailaddress, zipcode, firstname, emailok) -[ RECORD
> > 8
> > ]-------------------------------------------------------------
> > --------------------------------------------------------------
> > -------------------------
> > tablename  | members
> > indexname  | members_memberid_idx
> > tablespace | members_idx
> > indexdef   | CREATE INDEX members_memberid_idx ON members USING btree
> > (memberid)
> > -[ RECORD
> > 9
> > ]-------------------------------------------------------------
> > --------------------------------------------------------------
> > -------------------------
> > tablename  | members
> > indexname  | email_website_unq
> > tablespace | members_idx
> > indexdef   | CREATE UNIQUE INDEX email_website_unq ON members USING
> > btree (emailaddress, websiteid)
> >
> >
> > This table has also been vacuumed analyzed as well:
> >
> > select * from pg_stat_all_tables where relname = 'members';
> > -[ RECORD 1 ]----+------------------------------
> > relid            | 3112786
> > schemaname       | xxxxx
> > relname          | members
> > seq_scan         | 298
> > seq_tup_read     | 42791828896
> > idx_scan         | 31396925
> > idx_tup_fetch    | 1083796963
> > n_tup_ins        | 291308316
> > n_tup_upd        | 0
> > n_tup_del        | 4188020
> > n_tup_hot_upd    | 0
> > n_live_tup       | 285364632
> > n_dead_tup       | 109658
> > last_vacuum      | 2010-10-12 20:26:01.227393-04
> > last_autovacuum  |
> > last_analyze     | 2010-10-12 20:28:01.105656-04
> > last_autoanalyze | 2010-09-16 20:50:00.712418-04
> >
> >
>
>
> Tony,
> For your query:
>
> > select
> > emailaddress, websiteid
> >   from members
> >  where emailok = 1
> >    and emailbounced = 0;
>
> your table doesn't have any indexes where "emailok" or "emailbounced"
> are leading columns.
> That's why existing indexes can not be used.
>
> If you specified "websiteid" in the "where" clause then (most probably)
> the index members_sorted_idx_001 will be used (based on selectivity and
> statistics known to optimizer).
>
> If this query (as is - without "websiteid") is important for your app,
> create another index on (emailok, emailbounced) which should help, of
> course if selectivity of your where clause is good enough (not to
> perform full table scan).
>
> Regards,
> Igor Neyman
>



pgsql-performance by date:

Previous
From: "Igor Neyman"
Date:
Subject: Re: oracle to psql migration - slow query in postgres
Next
From: Tom Lane
Date:
Subject: Re: Index scan / Index cond limitation or ?