View based upon function won't use index on joins - Mailing list pgsql-performance

From Jonathan Foy
Subject View based upon function won't use index on joins
Date
Msg-id 4b46b5f00911190906w43554374i8c071ec18f96de2f@mail.gmail.com
Whole thread Raw
Responses Re: View based upon function won't use index on joins
List pgsql-performance
Hello,

I've inherited some very...interestingly...
designed tables, and am trying to figure out how to make them usable.  I've got an ugly hack in place, but it will not use an index properly, and I'm hoping someone will be able to point me in the right direction.

Production is running 8.1.3, but I'm testing in 8.3.3.  I know that's not good, but I'm seeing the exact same problem in both, so hopefully fixing it in one will fix the other.

All tables/functions/views are included at the bottom, somewhat truncated to reduce length/repetition.

The table in question (though not the only one with this problem) has a series of 24 column pairs per row, one holding a code and the other a value.  Any code/value combo could be populated in any of these fields (the codes identify the type of value).  The row is keyed into based upon an id number/qualifier pair.  So, for a single id number/qualifier, there can be from 0 to 24 populated pairs.  We need to go in for a single key and pull a list of all codes/values.  Hopefully that makes sense.

I created a set-returning function that would pull in the row for a specific number/qualifier combination, check each code to see if it was null/empty, and if not it would return a record containing the code/value.

For various reasons I needed to create a view based upon this.  Due to postgres not liking having set-returning pl/pgsql functions in select statements, the only way that I could get the view to work was to create a pl/sql wrapper that simply pulls the results of the prior pl/pgsql function.

I have the view working, and if I pull straight from the view it uses the index properly (on id_nbr, id_qfr).  However, if I try to join to another table, based upon the indexed fields, I get a sequential scan.  This is not ideal at all.  I know a lot of this is bad practice and ugly, but I need to get something that will work.

Any ideas?  I'm willing to rework any and all as far as views/functions are concerned, redesigning the tables is sadly not an option at this time.


Ugly table:

CREATE TABLE value_codes
(
  id_nbr integer NOT NULL,
  id_qfr character(1) NOT NULL,
  val_1_cd_1 character varying(30),
  val_1_amt_1 numeric(10,2),
  val_1_cd_2 character varying(30),
  val_1_amt_2 numeric(10,2),
  ...
  val_2_cd_12 character varying(30),
  val_2_amt_12 numeric(10,2),
  CONSTRAINT value_codes_pkey PRIMARY KEY (id_nbr, id_qfr)
)
WITH (
  OIDS=TRUE
);



Joined table:

CREATE TABLE main_table
(
  id_nbr integer NOT NULL,
  id_qfr character(1) NOT NULL,
  create_dt character(8),
  create_tm character(8),
  CONSTRAINT main_table_pkey PRIMARY KEY (id_nbr, id_qfr)
)
WITH (
  OIDS=FALSE
);

CREATE INDEX main_table_create_dt_index
  ON main_table
  USING btree
  (create_dt);



Initial function:

CREATE OR REPLACE FUNCTION get_value_codes(IN fun_id_nbr integer,
    IN fun_id_qfr character,
    OUT value_code character varying,
    OUT value_amount numeric)
  RETURNS SETOF record AS
$BODY$
declare
    current_row    record;
begin

    select    val_1_cd_1,
        val_1_amt_1,
        val_1_cd_2,
        val_1_amt_2,
        ...
        val_2_cd_12,
        val_2_amt_12
    into     current_row
    from     value_codes
    where   id_nbr = fun_id_nbr
        and id_qfr = fun_id_qfr;

    if
        current_row.val_1_cd_1 is not null
        and current_row.val_1_cd_1 != ''
    then
        value_code := current_row.val_1_cd_1;
        value_amount := current_row.val_1_amt_1;

        return next;
    end if;
    ...
    if
        current_row.val_2_cd_12 is not null
        and current_row.val_2_cd_12 != ''
    then
        value_code := current_row.val_2_cd_12;
        value_amount := current_row.val_2_amt_12;

        return next;
    end if;

    return;
end;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100
  ROWS 10;



Wrapper function:

CREATE OR REPLACE FUNCTION get_value_codes_wrapper(IN id_nbr integer,
    IN id_qfr character,
    OUT value_code character varying,
    OUT value_amount numeric)
  RETURNS SETOF record AS
$BODY$
    SELECT * FROM get_value_codes($1, $2);
$BODY$
  LANGUAGE 'sql' VOLATILE
  COST 100
  ROWS 10;



View:

CREATE OR REPLACE VIEW value_codes_view AS
 SELECT value_codes.id_nbr,
      value_codes.id_qfr,
      (get_value_codes_wrapper(value_codes.id_nbr, value_codes.id_qfr)).value_code AS value_code,
      (get_value_codes_wrapper(value_codes.id_nbr, value_codes.id_qfr)).value_amount AS value_amount
   FROM value_codes;



Simple query Explained:

explain analyze select * from value_codes_view where id_nbr >= 90000000;

Index Scan using value_codes_pkey on value_codes  (cost=0.00..128.72 rows=53 width=6) (actual time=17.593..172.031 rows=15 loops=1)
  Index Cond: (id_nbr >= 90000000)
Total runtime: 172.141 ms


Join query explained:

explain analyze select * from main_table, value_codes_view
where create_dt >= '20091001'
and main_table.id_nbr = value_codes_view.id_nbr
and main_table.id_qfr = value_codes_view.id_qfr;

Hash Join  (cost=24.38..312425.40 rows=1 width=97) (actual time=220062.607..220295.870 rows=1 loops=1)
  Hash Cond: ((value_codes.id_nbr = main_table.id_nbr) AND (value_codes.id_qfr = main_table.id_qfr))
  ->  Seq Scan on value_codes  (cost=0.00..297676.77 rows=535427 width=6) (actual time=15.846..219553.511 rows=138947 loops=1)
  ->  Hash  (cost=21.47..21.47 rows=194 width=24) (actual time=0.455..0.455 rows=53 loops=1)
        ->  Index Scan using main_table_create_dt_index on main_table  (cost=0.00..21.47 rows=194 width=24) (actual time=0.033..0.243 rows=53 loops=1)
              Index Cond: (create_dt >= '20091001'::bpchar)
Total runtime: 220296.173 ms

pgsql-performance by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: SSD + RAID
Next
From: Scott Marlowe
Date:
Subject: Re: SSD + RAID