Re: Moving data from huge table slow, min() query on indexed column taking 38s - Mailing list pgsql-general

From: Dimitrios Apostolou
Subject: Re: Moving data from huge table slow, min() query on indexed column taking 38s
Date:
Msg-id: 68257d84-96e5-9dca-ff31-c9f45e356543@gmx.net
In response to: Moving data from huge table slow, min() query on indexed column taking 38s (Dimitrios Apostolou <jimis@gmx.net>)
List: pgsql-general
On Thu, 6 Jul 2023, Dimitrios Apostolou wrote:
> + Then I broke the command above into many small chunks
>
>       WITH rows AS (
>           DELETE FROM tbl_legacy AS t
>               WHERE (partition_key_column < $1)
>               RETURNING t.*
>       )
>       INSERT INTO tbl SELECT * FROM rows;
>       COMMIT;
>
>   I increase the parameter $1 and keep going in a loop.  At first this
>   goes fine, but after a day I notice that it has slowed down
>   significantly.  My investigation shows that the simple query
>
>    SELECT min(partition_key_column) FROM tbl_legacy;
>
>   takes 38s, despite there being an index on the column!  A VACUUM fixes
>   that, so I suspect the index contains a ton of dead tuples.  Autovacuum
>   probably doesn't get to do its job because the table is constantly busy.
>
>   Unfortunately VACUUM takes a long time (several hours) on this huge
>   table, so I can't add it to the loop after the DELETE command.
>
>   Is there a better way to avoid the bloat in the loop above?  Why can't
>   the DELETE command update the index so that the minimum entry points
>   past the dead tuples?

Any comments on this one? It annoys me that such a simple loop deteriorated
so much and kept filling the table with bloat. What is it that VACUUM does,
and DELETE can't, to keep the index fresh?
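
For reference, here is the kind of loop described above, condensed into a
single PL/pgSQL DO block for illustration. In reality each chunk runs as
its own statement from a client script; the bigint key type and the chunk
step of 100 are made-up placeholders:

    DO $$
    DECLARE
        cutoff  bigint := 0;   -- assumes a bigint partition key
        max_key bigint;
    BEGIN
        SELECT max(partition_key_column) INTO max_key FROM tbl_legacy;
        WHILE cutoff <= max_key LOOP
            cutoff := cutoff + 100;  -- made-up chunk step
            -- Move one chunk: delete from the old table and insert the
            -- returned rows into the new one, in a single statement.
            WITH rows AS (
                DELETE FROM tbl_legacy AS t
                    WHERE (t.partition_key_column < cutoff)
                    RETURNING t.*
            )
            INSERT INTO tbl SELECT * FROM rows;
            -- Transaction control in a DO block needs PostgreSQL 11+ and
            -- only works when the block is not run inside an outer
            -- transaction.
            COMMIT;
        END LOOP;
    END $$;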
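
As a quick sketch of how one might check the dead-tuple theory on a setup
like this, the standard statistics view pg_stat_user_tables shows the
dead-tuple count and the last (auto)vacuum times for the table:

    SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'tbl_legacy';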


