Re: Performance issues with parallelism and LIMIT - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: Performance issues with parallelism and LIMIT
Msg-id: c3ab3c44-5fb3-4f9c-e5ab-d9af536927a7@enterprisedb.com
In response to: Re: Performance issues with parallelism and LIMIT (David Geier <geidav.pg@gmail.com>)
List: pgsql-hackers
On 2/20/23 19:18, David Geier wrote:
> Hi,
>
> On 2/8/23 11:42, Tomas Vondra wrote:
>> On 2/1/23 14:41, David Geier wrote:
>>
>> Yeah, this is a pretty annoying regression. We already can hit poor
>> behavior when matching rows are not distributed uniformly in the tables
>> (which is what LIMIT costing assumes), and this makes it more likely to
>> hit similar issues. A bit like when doing many HTTP requests makes it
>> more likely to hit at least one 99% outlier.
>
> Are you talking about the use of ordering vs. filtering indexes in
> queries where both an ORDER BY and a filter are present (e.g. using an
> ordering index, but all rows passing the filter are at the end of the
> table)? If not, can you elaborate a bit more on that and maybe give an
> example?

Yeah, roughly. I don't think the explicit ORDER BY is a requirement for
this to happen - it's enough when the part of the plan below the LIMIT
produces many rows, but the matching rows are at the end.

>> No opinion on these options, but worth a try. Alternatively, we could
>> try the usual doubling approach - start with a low threshold (and set
>> the latch frequently), and then gradually increase it up to the 1/4.
>>
>> That should work both for queries expecting only a few rows and those
>> producing a lot of data.
>
> I was thinking about this variant as well. One more alternative would be
> latching the leader once a worker has produced 1/Nth of the LIMIT, where
> N is the number of workers. Both variants have the disadvantage that
> there are still corner cases where the latch is set too late; but either
> would for sure be much better than what we have today.
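For what it's worth, the doubling variant could be modeled roughly like
this - a toy sketch with made-up names and an arbitrary 64-byte starting
threshold, not the actual shm_mq code:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the "doubling" idea: start with a low flush threshold (so
 * the leader's latch is set early, which helps LIMIT-style queries) and
 * double it after each flush, capping at ring_size / 4 -- the fixed
 * threshold used today. All names here are hypothetical. */
typedef struct
{
    size_t ring_size;        /* size of the shared-memory ring buffer */
    size_t flush_threshold;  /* current pending-bytes flush threshold */
} mq_state;

#define INITIAL_THRESHOLD 64 /* arbitrary small starting point */

static void
mq_init(mq_state *mq, size_t ring_size)
{
    mq->ring_size = ring_size;
    mq->flush_threshold = INITIAL_THRESHOLD;
}

/* Decide whether the pending bytes should be flushed (and the leader's
 * latch set) now; on each flush, grow the threshold geometrically until
 * it reaches ring_size / 4. */
static int
mq_should_flush(mq_state *mq, size_t pending_bytes)
{
    if (pending_bytes <= mq->flush_threshold)
        return 0;
    if (mq->flush_threshold < mq->ring_size / 4)
    {
        mq->flush_threshold *= 2;
        if (mq->flush_threshold > mq->ring_size / 4)
            mq->flush_threshold = mq->ring_size / 4;
    }
    return 1;
}
```

The early flushes are cheap relative to a short query's total output, and
queries producing a lot of data quickly converge to the current 1/4
behavior, so the extra latch traffic is bounded.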
> I also did some profiling, and - at least on my development laptop with
> 8 physical cores - the original example motivating the batching change
> is slower than when batching is disabled by commenting out:
>
>     if (force_flush || mqh->mqh_send_pending > (mq->mq_ring_size >> 2))
>
> SET parallel_tuple_cost TO 0;
> CREATE TABLE b (a int);
> INSERT INTO b SELECT generate_series(1, 200000000);
> ANALYZE b;
> EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM b;
>
> Gather (cost=1000.00..1200284.61 rows=200375424 width=4) (actual
> rows=200000000 loops=1)
>   Workers Planned: 7
>   Workers Launched: 7
>   ->  Parallel Seq Scan on b (cost=0.00..1199284.61 rows=28625061
> width=4) (actual rows=25000000 loops=8)
>
> Always latch: 19055 ms
> Batching:     19575 ms
>
> If I find some time, I'll play around a bit more and maybe propose a
> patch.

OK. Once you have a WIP patch, maybe share it and I'll try to do some
profiling too.

>>> ...
>>>
>>> We would need something similar to CHECK_FOR_INTERRUPTS() which returns
>>> a NULL slot if a parallel worker is supposed to stop execution (we could
>>> e.g. check if the queue got detached). Or could we amend
>>> CHECK_FOR_INTERRUPTS() to just stop the worker gracefully if the queue
>>> got detached?
>>>
>> That sounds reasonable, but I'm not very familiar with the leader-worker
>> communication, so no opinion on how it should be done.
>
> I think an extra macro that needs to be called from dozens of places to
> check if parallel execution is supposed to end is the least preferred
> approach. I'll read up more on how CHECK_FOR_INTERRUPTS() works and
> whether we can actively signal the workers that they should stop.

IMHO, if this requires adding another macro to a bunch of ad hoc places,
that's rather inconvenient. It'd be much better to fix this in a
localized manner (especially as it seems related to a fairly specific
place).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
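The graceful-stop idea discussed in the thread - a worker's tuple-emitting
loop noticing that the leader detached its queue and ending execution
instead of producing rows nobody will read - can be sketched as a
simplified, self-contained model. All names here are hypothetical; in
PostgreSQL the detach state lives in the shm_mq machinery and such a check
would sit near CHECK_FOR_INTERRUPTS(), not in a struct like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of stopping a parallel worker gracefully once the leader has
 * detached its tuple queue (e.g. because its LIMIT was satisfied). */
typedef struct
{
    bool detached;  /* set once the leader detaches the queue */
} tuple_queue;

/* Emit up to max_tuples, but stop as soon as the queue is detached.
 * Returns the number of tuples actually produced. */
static size_t
worker_produce(tuple_queue *q, size_t max_tuples)
{
    size_t produced = 0;

    while (produced < max_tuples)
    {
        if (q->detached)
            break;      /* leader is gone: end execution gracefully */
        produced++;     /* stand-in for scanning and emitting one tuple */
    }
    return produced;
}
```

The point of the localized check is that only the loop that hands tuples
to the queue needs it, rather than sprinkling a new macro across dozens of
executor call sites.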