Thread: Re: RFC/PoC: GUC option to enable tuple queue autoflush for parallel workers
From: Francesco Degrassi
Date:
Hello,

I hope bumping this up is not frowned upon.
Any chance we can get some feedback?

Thanks and best regards,
Francesco

On Thu, 26 Sept 2024 at 16:15, Francesco Degrassi <francesco.degrassi@optionfactory.net> wrote:
>
> Hi all. A brief overview of our use case follows.
>
> We are developing a foreign data wrapper which employs parallel scan
> support and predicate pushdown; given the types of queries we run,
> foreign scans can be very long and often return very few rows.
>
> As the scan can be very long and slow, we'd like to provide partial
> results to the user as rows are being returned. We found two problems
> with that:
> 1. The leader backend would not poll the parallel workers' queues until
> it itself found a row to return; we worked around this by setting
> `parallel_leader_participation` to off.
> 2. Parallel workers' tuple queues are buffered, and are not flushed
> until a certain fill threshold is reached; as our queries yield few
> result rows, oftentimes these rows would only get returned at the end
> of the (very long) scan.
>
> The proposal is to add a `parallel_tuplequeue_autoflush` GUC (bool,
> default false) that would force every row returned by a parallel
> worker to be immediately flushed to the leader; this was already the
> case before v15, so it simply allows opting back into the previous
> behaviour.
>
> This would be achieved by configuring an `auto_flush` field on
> `TQueueDestReceiver`, so that `tqueueReceiveSlot` would pass
> `force_flush` when calling `shm_mq_send`.
>
> The attached patch, tested on master @ 1ab67c9dfaadda, is a tentative
> PoC implementation.
> Based on feedback, we're available to work on a complete and properly
> documented patch.
>
> Thanks in advance for your consideration.
>
> Regards,
> Francesco
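
For readers who have not opened the patch, the buffering problem and the proposed switch can be illustrated with a small standalone model. This is only a sketch, not the attached patch and not PostgreSQL code: `ToyQueue`, `ToyReceiver`, `toy_queue_send` and `toy_receive_slot` are hypothetical stand-ins for the shared memory queue, `TQueueDestReceiver`, `shm_mq_send` and `tqueueReceiveSlot`, and the flush threshold is an arbitrary number chosen by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one worker's tuple queue (hypothetical, not shm_mq). */
typedef struct ToyQueue
{
    size_t pending_bytes;   /* written by the worker, not yet visible to the leader */
    size_t visible_bytes;   /* already visible to the leader */
    size_t flush_threshold; /* pending bytes that trigger a flush on their own */
    int    flushes;         /* how many times the leader was notified */
} ToyQueue;

void
toy_queue_init(ToyQueue *q, size_t flush_threshold)
{
    assert(q != NULL);
    q->pending_bytes = 0;
    q->visible_bytes = 0;
    q->flush_threshold = flush_threshold;
    q->flushes = 0;
}

/*
 * Buffer nbytes; publish them to the leader only when the threshold is
 * reached or when the caller asks for an immediate flush.
 */
void
toy_queue_send(ToyQueue *q, size_t nbytes, bool force_flush)
{
    assert(q != NULL);
    q->pending_bytes += nbytes;
    if (force_flush || q->pending_bytes >= q->flush_threshold)
    {
        q->visible_bytes += q->pending_bytes;
        q->pending_bytes = 0;
        q->flushes++;
    }
}

/* Toy stand-in for the receiver; auto_flush would mirror the proposed GUC. */
typedef struct ToyReceiver
{
    ToyQueue *queue;
    bool      auto_flush;
} ToyReceiver;

/* Forward one tuple to the queue, passing auto_flush through as force_flush. */
void
toy_receive_slot(ToyReceiver *r, size_t tuple_bytes)
{
    assert(r != NULL && r->queue != NULL);
    toy_queue_send(r->queue, tuple_bytes, r->auto_flush);
}
```

With `auto_flush` false, a worker that emits a few small tuples leaves them in `pending_bytes` until the threshold is crossed, which matches problem 2 in the quoted message; with `auto_flush` true, every tuple becomes visible as soon as it is sent, at the cost of one notification per row.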