Re: Very newbie question - Mailing list pgsql-general

From Peter J. Holzer
Subject Re: Very newbie question
Date
Msg-id 20231026112844.r75oxsxz4tlybij5@hjp.at
Whole thread Raw
In response to Re: Very newbie question  (Olivier Gautherot <ogautherot@gautherot.net>)
List pgsql-general
On 2023-10-26 11:56:56 +0200, Olivier Gautherot wrote:
> El jue, 26 oct 2023 11:15, Peter J. Holzer <hjp-pgsql@hjp.at> escribió:
>     On 2023-10-25 17:48:46 +0200, Olivier Gautherot wrote:
>     > El mié, 25 oct 2023 16:58, Олег Самойлов <splarv@ya.ru> escribió:
>     >     Okey, I see no one was be able to solve this problem. But I could.
>     >     May be
>     >     for someone this will be useful too. There is solution.
>     [...]
>     >     Now query is:
>     >
>     >     SELECT generate_series(min(id)/10000000, max(id)/10000000) AS n FROM
>     >     delivery) as part_numbers
>     >              WHERE (SELECT max(created_at) from delivery where n*10000000
>     <=id
>     >     and id < (n+1)*10000000)
>     >                 < CURRENT_DATE-'3 month'::interval;

I just realized that this query is mangled. I'm going to assume that it
should have been something like

with part_numbers as (
    SELECT generate_series(min(id)/100, max(id)/100) as n
    from delivery
)
select * from part_numbers
WHERE (SELECT max(created_at) from delivery where n*100 <= id and id < (n+1)*100) < CURRENT_DATE-'3 month'::interval;

>     [...]
>     > Your fast solution will work as long as you don't have missing sequences
>     (like
>     > deleted rows).
>
>     Why do you think this would break with missing sequence numbers?
>
>
> In the suggested query, the return value contains a list of sequential numbers
> from a min to a max - they seem to be markers of the partitions. Let's assume
> that a complete partition is deleted in the middle: its index will still be
> returned by the query, although it doesn't exist any more in the table.

I don't think it will. While the generate_series() will produce the
partition number, the where clause will not find any matching rows, so
the query will not return it.

E.g. (this table isn't partitioned, but that shouldn't affect the
result, also I'll reduce the "partition size" to 100 to make it more
readable):

create table delivery (id int, created_at date);
insert into delivery(200, '2000-01-01');
insert into delivery values(200, '2000-01-01');
insert into delivery values(299, '2000-12-01');
insert into delivery values(412, '2002-02-01');
insert into delivery values(439, '2002-03-01');
insert into delivery values(501, '2023-01-01');
insert into delivery values(555, now());

Note that there are no records in "partition" 3, and "partition" 5
contains current data, so we should get only "partition numbers" 2 and
4:

with part_numbers as (
    SELECT generate_series(min(id)/100, max(id)/100) as n
    from delivery
)
select * from part_numbers
WHERE (SELECT max(created_at) from delivery where n*100 <= id and id < (n+1)*100) < CURRENT_DATE-'3 month'::interval;

╔═══╗
║ n ║
╟───╢
║ 2 ║
║ 4 ║
╚═══╝
(2 rows)

Looks ok to me.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment

pgsql-general by date:

Previous
From: Shaozhong SHI
Date:
Subject: PgAmin view
Next
From: Rozad Khalaf
Date:
Subject: Introducing PostgresFly: A New PostgreSQL Installation Tool