Re: tuple radix sort - Mailing list pgsql-hackers
| From | Chao Li |
|---|---|
| Subject | Re: tuple radix sort |
| Date | |
| Msg-id | DDA507E1-56DB-4D76-8B81-41660ED932B9@gmail.com Whole thread Raw |
| In response to | tuple radix sort (John Naylor <johncnaylorls@gmail.com>) |
| Responses |
Re: tuple radix sort
|
| List | pgsql-hackers |
> On Oct 29, 2025, at 14:28, John Naylor <johncnaylorls@gmail.com> wrote: > > I suspect the challenge > will be multikey sorts when the first key has low cardinality. As you predicted, when the first key has very low cardinality, radix is a little bit slower. I built a test that proves that: ``` evantest=# drop table if exists test_multi; evantest=# create unlogged table test_multi (category int, name text); — first column has only 4 distinct values evantest=# insert into test_multi select (random() * 4)::int as category, md5(random()::text) || md5(random()::text) as namefrom generate_series(1, 1000000); evantest=# vacuum freeze test_multi; evantest=# select count(*) from test_multi; evantest=# set work_mem = '64MB’; evantest-# \timing on Timing is on. evantest=# set wip_radix_sort = 'off'; Time: 0.403 ms evantest=# \o /dev/null evantest=# select * from test_multi order by category, name; Time: 5607.336 ms (00:05.607) evantest=# select * from test_multi order by category, name; Time: 5703.555 ms (00:05.704) evantest=# select * from test_multi order by category, name; Time: 5692.644 ms (00:05.693) evantest=# set wip_radix_sort = 'on'; Time: 0.859 ms evantest=# select * from test_multi order by category, name; Time: 5822.979 ms (00:05.823) evantest=# select * from test_multi order by category, name; Time: 5881.256 ms (00:05.881) evantest=# select * from test_multi order by category, name; Time: 5976.351 ms (00:05.976) ``` Roughly 5% slower for this corner case. However, when I recreate the test table with high cardinality first column, wip_radix_sort seems still slower: ``` evantest=# \o evantest=# drop table if exists test_multi; DROP TABLE evantest=# create unlogged table test_multi (category int, name text); CREATE TABLE evantest=# insert into test_multi evantest-# select (random() * 1000000)::int as category, md5(random()::text) || md5(random()::text) as name from generate_series(1,1000000); INSERT 0 1000000 evantest=# vacuum freeze test_multi; VACUUM evantest=# select count(*) from test_multi; count --------- 1000000 (1 row) evantest=# select * from test_multi limit 5; category | name ----------+------------------------------------------------------------------ 607050 | c555126a5afea9f5ffe3880248c89944d211bc378f8c3b6d125b4360fe8619b7 843579 | 69b5a1dba76f52ff238566a3f88315a7425116d5d271fb54701b6e49d4afd8ce 106298 | a96e8674db219e12463ecdbb405b99c631767972e489093045c97976c17c6561 621860 | 5e6739ea9f533f9cdb0b8db76e3d4ce63be6b2b612c8aff06c4b80451f8f2edc 290110 | 56944320e5abd3a854fffdd185b969727e8d414448d440725a392cda4c6355c4 (5 rows) evantest=# \timing on Timing is on. evantest=# \o /dev/null evantest=# set wip_radix_sort = 'off'; Time: 0.904 ms evantest=# select * from test_multi limit 5; Time: 0.983 ms evantest=# select * from test_multi order by category, name; Time: 593.578 ms evantest=# select * from test_multi order by category, name; Time: 597.329 ms evantest=# select * from test_multi order by category, name; Time: 600.050 ms evantest=# set wip_radix_sort = 'on'; Time: 0.737 ms evantest=# select * from test_multi order by category, name; Time: 611.604 ms evantest=# select * from test_multi order by category, name; Time: 613.115 ms evantest=# select * from test_multi order by category, name; Time: 615.003 ms ``` This seems like a real regression. Then I tried to only sort on the first column, yes, now radix is faster: ``` evantest=# set wip_radix_sort = 'off’; evantest=# select * from test_multi order by category; Time: 445.498 ms evantest=# select * from test_multi order by category; Time: 451.834 ms evantest=# select * from test_multi order by category; Time: 454.531 ms evantest=# set wip_radix_sort = 'on'; Time: 0.329 ms evantest=# select * from test_multi order by category; Time: 402.829 ms evantest=# select * from test_multi order by category; Time: 408.014 ms evantest=# select * from test_multi order by category; Time: 415.340 ms evantest=# select * from test_multi order by category; Time: 413.969 ms ``` Hope the test helps. (The test was run a MacBook M4. ) Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
pgsql-hackers by date: