Re: tuple radix sort - Mailing list pgsql-hackers

From John Naylor
Subject Re: tuple radix sort
Date
Msg-id CANWCAZbrKH-sDvzqb5z8BEofpXyXuN42UCRzoQ8wT3CiqPATFg@mail.gmail.com
Whole thread Raw
In response to Re: tuple radix sort  (Chao Li <li.evan.chao@gmail.com>)
Responses Re: tuple radix sort
List pgsql-hackers
On Thu, Oct 30, 2025 at 8:56 AM Chao Li <li.evan.chao@gmail.com> wrote:

> I changed work_men to 1GB and reran the test. As the high cardinality data are still there, so I first reran with
data:

> With radix_sort on and off, execution time are almost the same.

Are you by chance running with asserts on? It's happened before, so I
have to make sure. That makes a big difference here because I disabled
diversion thresholds in assert builds so that regression tests (few
cases with large inputs) cover the paths I want, in addition to my
running a standalone stress test.

Speaking of tests, I forgot to mention that regression tests will fail
since in-place radix sort is an unstable sort, as qsort is as well,
but in a different way -- this is expected. In assert builds, the
patch verifies the sort after the fact with the standard comparator,
and will throw an error if it's wrong.

On Thu, Oct 30, 2025 at 9:19 AM Chao Li <li.evan.chao@gmail.com> wrote:
> I just quick went through the code change. I guess I need more time to understand the entire logic, but I find a typo
thatmight effect the tests: 
>
> ```
> +       Assert(last = first);
> ```
>
> “=“ should be “=="

Yes, you're quite right, thanks for spotting! I reran my stress test
that has randomly distributed NULLs and the assert still holds, so
nothing further to fix yet. The NULL partitioning part of the code
hasn't been well tested in its current form, and I may arrange things
so that that step and the datum conditioning step happen in a single
pass. I'm not yet sure if it's important enough to justify the
additional complexity.

--
John Naylor
Amazon Web Services



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Report bytes and transactions actually sent downtream
Next
From: Corey Huinker
Date:
Subject: Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)