Re: Very newbie question - Mailing list pgsql-general

From Olivier Gautherot
Subject Re: Very newbie question
Date
Msg-id CAJ7S9TX4z3cOvD4PVFFH73qC2qwr+o1QoF=ERLVQJpy0u+De5A@mail.gmail.com
Whole thread Raw
In response to Re: Very newbie question  (Олег Самойлов <splarv@ya.ru>)
Responses Re: Very newbie question  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
List pgsql-general
Hi,

El mié, 25 oct 2023 16:58, Олег Самойлов <splarv@ya.ru> escribió:
Okey, I see no one was be able to solve this problem. But I could. May be for someone this will be useful too. There is solution.

Original query was:

> 23 окт. 2023 г., в 18:13, Олег Самойлов <splarv@ya.ru> написал(а):
>
> SELECT id/10000000 as partition
>   FROM delivery
>   GROUP BY partition
>   HAVING max(created_at) < CURRENT_DATE - '3 month'::interval;

And I was not able to accelerate it by any index, works 5 minutes. Now query is:

SELECT generate_series(min(id)/10000000, max(id)/10000000) AS n FROM delivery) as part_numbers
         WHERE (SELECT max(created_at) from delivery where n*10000000 <=id and id < (n+1)*10000000)
            < CURRENT_DATE-'3 month'::interval;

Return the same (number of partition need to archive), accelerated by two btree index: on id and created_at. Works very quick, less then second.

If you happen to rework your design, consider partitioning on (created_at), as it may simplify your maintenance.

The reason why you couldn't improve the performance with an index is due to the calls of min() and max() that force to evaluate every single row. You may consider using a computed index in this case.

Your fast solution will work as long as you don't have missing sequences (like deleted rows).

Regards
Olivier

pgsql-general by date:

Previous
From: John W Higgins
Date:
Subject: Re: Question about the new PostgreSQL 16 availability on Ubuntu
Next
From: Ray O'Donnell
Date:
Subject: Re: Question about the new PostgreSQL 16 availability on Ubuntu