On Fri, Apr 17, 2026 at 2:26 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>
> On 16/04/2026 19:58, Masahiko Sawada wrote:
> > On Thu, Apr 16, 2026 at 1:26 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> >>
> >> On 16/04/2026 10:11, Masahiko Sawada wrote:
> >>> On Thu, Apr 16, 2026 at 12:13 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> >>> -- Random TIDs test. The offset numbers are randomized and must be --
> >>> unique and ordered. INSERT INTO hideblocks (blockno) SELECT
> >>> do_set_block_offsets(blkno, array_agg(DISTINCT greatest((random() *
> >>> :maxoffset)::int, 1))::int2[]) FROM generate_series(1, 100)
> >>> num_offsets, generate_series(1000, 1100, 1) blkno GROUP BY blkno;
> >>
> >> Alright, I used an explicit sort in reverse order to make sure the test is
> >> stable. I usually create modules that may change different paths, costs, and
> >> orders, and using random can make things unpredictable. But for this specific
> >> test, I don't see any risk.
> >>
> >>>
> >>> While I agree that we need to sort the offset numbers, I think it
> >>> would be better to make sure the offset numbers in the array to be
> >>> sorted in a test_tidstore.sql file where required, instead of doing so
> >>> for all cases.
> >>
> >> I'm not sure I follow. Are you saying that do_set_block_offsets shouldn't sort
> >> the incoming offsets?
> >
> > No, I wanted to mean that if we sort the given array in
> > do_set_block_offsets() as the proposed patch does, we end up always
> > sorting arrays even if the sorting is no actually required (e.g., when
> > executing "SELECT do_set_block_offsets(1,
> > array[1,2,3,4,100]::int2[]);"). So an alternative idea to stabilize
> > the regression test would be to create a SQL function to return a list
> > of sorted offsets and use it where it's required. While the patch gets
> > a little bigger, It would also help simplify the tests somewhat by
> > removing the redundant codes. I've attached the patch for this idea.
>
> Ok. No objections. Both changes are just test routines registered by the
> test_tidstore module.
>
> I decided to add C code, mostly following the idea that we reuse examples from
> the Postgres codebase when writing our patches/extensions. An explicit
> demonstration of the sort contract on the TidStoreSetBlockOffsets() call might
> help developers who don't read function comments each time.
Understood. After more thoughts, I think your idea would be better.
One thing still unclear to me is in which situation the query inthe
test produces an array of unsorted offset numbers. While I understand
it's not guaranteed that the DISTINCT clause returns the sorted
result, doing DISTINCT in an aggregation function is using sort-based
deduplication. I'd like to confirm that the queries in the test could
end up producing the results that violate the assertion. Is it
possible to do that by changing GUC parameters or something?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com