Re: Index no longer being used, destroying and recreating it restores use. - Mailing list pgsql-general

From Koen De Groote
Subject Re: Index no longer being used, destroying and recreating it restores use.
Date
Msg-id CAGbX52FKg9a+GDds8j34VSeEBy0mXAeK75u-6ufHX6sTBPTsFg@mail.gmail.com
Whole thread Raw
In response to Re: Index no longer being used, destroying and recreating it restores use.  (Koen De Groote <kdg.dev@gmail.com>)
Responses Re: Index no longer being used, destroying and recreating it restores use.  (Michael Lewis <mlewis@entrata.com>)
List pgsql-general
So, this query:

select * from item where shouldbebackedup=true and itemCreated<='2020-06-05 00:00:00.000' and backupperformed=false order by filepath asc, id asc limit 100 offset 10400;

Was made into a function:

create or replace function NeedsBackup(text, int, int default 100)
returns setof item as $$
BEGIN
    return query select * from item where shouldbebackedup=true and itemCreated<=$1::timestamp without time zone and backupperformed=false order by filepath asc, id asc limit $3 offset $2;
END;
$$
language 'plpgsql';

Having read the documentation, I do think this is STABLE and not VOLATILE? While the external process performing the query will eventually update these rows, setting backupperformed to true instead of false, that doesn't happen within this function.

Doing an EXPLAIN ANALYZE yields this output:

                                                     QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------
 Function Scan on public.needsbackup  (cost=0.25..10.25 rows=1000 width=5275) (actual time=80.107..80.115 rows=100 loops=1)
   Output: id, shouldbebackedup, backupperformed, itemCreated, filepath
   Function Call: needsbackup('2020-06-08 12:00:00.016'::text, 2100, 100)
   Buffers: shared read=1572
 Planning Time: 0.199 ms
 Execution Time: 80.499 ms
(6 rows)

And the actual output is identical to the full query.

I'm kind of worried I'm not seeing any more details as to how it all worked. Yes, it's as fast as I hoped, but that also happened last time when I re-created the index. Fast for a few weeks, then suddenly not anymore.

I tried looking up how to create indexes on functions, but found nothing that could help, also the suggestion that this is not possible.

So, at this point, what should I still look at, I wonder? EXPLAIN says it did a function call, but what happened under the hood there?

And before I forget: thank you most kindly for the advice so far, to all involved.

Regards,
Koen


On Sun, Jun 7, 2020 at 12:45 AM Koen De Groote <kdg.dev@gmail.com> wrote:
I'll attempt this next week.

On Fri, Jun 5, 2020, 21:11 Michael Lewis <mlewis@entrata.com> wrote:
Those row estimates are pretty far off.

Standard indexes and partial indexes don't get custom statistics created on them, but functional indexes do. I wonder if a small function needs_backup( shouldbebackedup, backupperformed ) and an index created on that function would nicely alleviate the pain. I would expect PG12 and extended statistics of type MCVs would bypass the need for that work around though.

pgsql-general by date:

Previous
From: Thorsten Schöning
Date:
Subject: Re: Concurrenctly running CREATE TEMP TABLE IF NOT EXISTS [...] AS [...]
Next
From: Chris Stephens
Date:
Subject: troubleshooting postgresql ldap authentication