Thread: Using indexes through a plpgsql procedure

Using indexes through a plpgsql procedure

From
Ciaran Doherty
Date:
Hello,


Thank you for taking the time to read this.

As a quick question. Can Postgres make use of indexes on a table when the data from that table is being returned from a procedure?

Some more details. I have a big table which has a long running query running on it. This table gets rebuilt from scratch on a weekly basis. Currently, I cannot update/replace the table in a timely manner, I have to wait for the long running queries to finish before we can update the table (the total time can be 5+ hours).

I am currently trying to build a procedure which will return data from the latest version of the table (by dynamically choosing the table name, something like
https://stackoverflow.com/questions/35559093/how-to-use-variable-as-table-name-in-plpgsql)  . To get a consistent return type I am having to use the `create type`  to build a custom return type. The problem (I think) I am having is that the indexes on the underlying tables are not usable after calling this procedure

(e.g.
```
select *
  from example_table as et
    join example_procedure() as ep on et.exapmle_column = ep.example_column
```
there will be no index on the column ep.example_column)

Is there a way to make sure indexes are used even if the data is accessed threw a procdure?

Thank you

--
while (e) { kyoatie(); }

Re: Using indexes through a plpgsql procedure

From
"Peter J. Holzer"
Date:
On 2021-04-07 12:15:37 +0100, Ciaran Doherty wrote:
> As a quick question. Can Postgres make use of indexes on a table when the data
> from that table is being returned from a procedure?
>
> Some more details. I have a big table which has a long running query running on
> it. This table gets rebuilt from scratch on a weekly basis. Currently, I cannot
> update/replace the table in a timely manner, I have to wait for the long
> running queries to finish before we can update the table (the total time can be
> 5+ hours).
>
> I am currently trying to build a procedure which will return data from the
> latest version of the table (by dynamically choosing the table name, something
> like
> https://stackoverflow.com/questions/35559093/
> how-to-use-variable-as-table-name-in-plpgsql)

So you are creating a new table for each update and drop older tables
after some time? And your procedure just returns data from the newest
table?


>   . To get a consistent return type I am having to use the `create
> type`  to build a custom return type. The problem (I think) I am
> having is that the indexes on the underlying tables are not usable
> after calling this procedure
>
> (e.g.
> ```
> select *
>   from example_table as et
>     join example_procedure() as ep on et.exapmle_column = ep.example_column
> ```
> there will be no index on the column ep.example_column)
>
> Is there a way to make sure indexes are used even if the data is accessed threw
> a procdure?

Any query within example_procedure() will be able to use an index. and
the join will in theory be able to use an index on example_table, although it
can't really get useful statistics (it doesn't know what fraction of the
example_table will be covered by the output of example_procedure()), so
it might default to doing a full scan anyway.

I would probably use a view for this: After creating the new table, just
recreate the view to use the new table.

Or maybe you don't even need the view and can get away by renaming the
old and new table:

begin;
    create table ep_new(...);
    -- populate ep_new here
    drop table if exists ep_old;
    alter table ep rename to ep_old;
    alter table ep_new rename to ep;
commit;

Partitioning should also work but that feels like a hack.

        hp


--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment