Re: PoC: Partial sort - Mailing list pgsql-hackers

From Marti Raudsepp
Subject Re: PoC: Partial sort
Date
Msg-id CABRT9RAtg=VE9G8tS=1Zttep867GY51Yd_wA9BKuk9+6LDwh+A@mail.gmail.com
Whole thread Raw
In response to Re: PoC: Partial sort  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: PoC: Partial sort  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Jan 28, 2014 at 7:51 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Tue, Jan 28, 2014 at 7:41 AM, Marti Raudsepp <marti@juffo.org> wrote:
But some benchmarks of planning performance are certainly warranted.

I didn't test it, but I worry that overhead might be high.
If it's true then it could be like constraint_exclusion option which id off by default because of planning overhead.

Sorry I didn't get around to this before.

I ran some synthetic benchmarks with single-column inner joins between 5 tables, with indexes on both joined columns, using only EXPLAIN (so measuring planning time, not execution) in 9 scenarios to excercise different code paths. According to these measurements, the overhead ranges between 1.0 and 4.5% depending on the scenario.

----
Merge join with partial sort children seems like a fairly obscure use case (though I'm sure it can help a lot in those cases). The default should definitely allow partial sort in normal ORDER BY queries. What's under question here is whether to enable partial sort for mergejoin.

So I see 3 possible resolutions:
1. The overhead is deemed acceptable to enable by default, in which case we're done here.
2. Add a three-value runtime setting like: enable_partialsort = [ off | no_mergejoin | on ], defaulting to no_mergejoin (just to get the point across, clearly we need better naming). This is how constraint_exclusion works.
3. Remove the partialsort mergejoin code entirely, keeping the rest of the cases.

What do you think?

----
All the tests are available here: https://github.com/intgr/benchjunk/tree/master/partial_sort (using script run2.sh)

Overhead by test (partial-sort-7.patch.gz):
join5.sql 2.9% (all joins on the same column)
star5.sql 1.7% ("star schema" kind of join)
line5.sql 1.9% (joins chained to each other)
lim_join5.sql 4.5% (same as above, with LIMIT 1)
lim_star5.sql 2.8%
lim_line5.sql 1.8%
limord_join5.sql 4.3% (same as above, with ORDER BY & LIMIT 1)
limord_star5.sql 3.9%
limord_line5.sql 1.0%

Full data:
PostgreSQL @ git ac8bc3b
join5.sql tps = 499.490173 (excluding connections establishing)
join5.sql tps = 503.756335 (excluding connections establishing)
join5.sql tps = 504.814072 (excluding connections establishing)
star5.sql tps = 492.799230 (excluding connections establishing)
star5.sql tps = 492.570615 (excluding connections establishing)
star5.sql tps = 491.949985 (excluding connections establishing)
line5.sql tps = 773.945050 (excluding connections establishing)
line5.sql tps = 773.858068 (excluding connections establishing)
line5.sql tps = 774.551240 (excluding connections establishing)
lim_join5.sql tps = 392.539745 (excluding connections establishing)
lim_join5.sql tps = 391.867549 (excluding connections establishing)
lim_join5.sql tps = 393.361655 (excluding connections establishing)
lim_star5.sql tps = 418.431804 (excluding connections establishing)
lim_star5.sql tps = 419.258985 (excluding connections establishing)
lim_star5.sql tps = 419.434697 (excluding connections establishing)
lim_line5.sql tps = 713.852506 (excluding connections establishing)
lim_line5.sql tps = 713.636694 (excluding connections establishing)
lim_line5.sql tps = 712.971719 (excluding connections establishing)
limord_join5.sql tps = 381.068465 (excluding connections establishing)
limord_join5.sql tps = 380.379359 (excluding connections establishing)
limord_join5.sql tps = 381.182385 (excluding connections establishing)
limord_star5.sql tps = 412.997935 (excluding connections establishing)
limord_star5.sql tps = 411.401352 (excluding connections establishing)
limord_star5.sql tps = 413.209784 (excluding connections establishing)
limord_line5.sql tps = 688.906406 (excluding connections establishing)
limord_line5.sql tps = 689.445483 (excluding connections establishing)
limord_line5.sql tps = 688.758042 (excluding connections establishing)

partial-sort-7.patch.gz
join5.sql tps = 479.508034 (excluding connections establishing)
join5.sql tps = 488.263674 (excluding connections establishing)
join5.sql tps = 490.127433 (excluding connections establishing)
star5.sql tps = 482.106063 (excluding connections establishing)
star5.sql tps = 484.179687 (excluding connections establishing)
star5.sql tps = 483.027372 (excluding connections establishing)
line5.sql tps = 758.092993 (excluding connections establishing)
line5.sql tps = 759.697814 (excluding connections establishing)
line5.sql tps = 759.792792 (excluding connections establishing)
lim_join5.sql tps = 375.517211 (excluding connections establishing)
lim_join5.sql tps = 375.539109 (excluding connections establishing)
lim_join5.sql tps = 375.841645 (excluding connections establishing)
lim_star5.sql tps = 407.683110 (excluding connections establishing)
lim_star5.sql tps = 407.414409 (excluding connections establishing)
lim_star5.sql tps = 407.526613 (excluding connections establishing)
lim_line5.sql tps = 699.905101 (excluding connections establishing)
lim_line5.sql tps = 700.349675 (excluding connections establishing)
lim_line5.sql tps = 700.661762 (excluding connections establishing)
limord_join5.sql tps = 364.607236 (excluding connections establishing)
limord_join5.sql tps = 364.367705 (excluding connections establishing)
limord_join5.sql tps = 363.694065 (excluding connections establishing)
limord_star5.sql tps = 397.036792 (excluding connections establishing)
limord_star5.sql tps = 397.197359 (excluding connections establishing)
limord_star5.sql tps = 395.797940 (excluding connections establishing)
limord_line5.sql tps = 680.907397 (excluding connections establishing)
limord_line5.sql tps = 682.206481 (excluding connections establishing)
limord_line5.sql tps = 681.210267 (excluding connections establishing)

Regards,
Marti

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: jsonb and nested hstore
Next
From: Peter Geoghegan
Date:
Subject: Re: Failure while inserting parent tuple to B-tree is not fun