Re: Append with naive multiplexing of FDWs - Mailing list pgsql-hackers
From | Bruce Momjian
---|---
Subject | Re: Append with naive multiplexing of FDWs
Date |
Msg-id | 20191130192611.GB4326@momjian.us
In response to | Re: Append with naive multiplexing of FDWs (Thomas Munro <thomas.munro@gmail.com>)
Responses | Re: Append with naive multiplexing of FDWs, Re: Append with naive multiplexing of FDWs
List | pgsql-hackers
On Sun, Nov 17, 2019 at 09:54:55PM +1300, Thomas Munro wrote:
> On Sat, Sep 28, 2019 at 4:20 AM Bruce Momjian <bruce@momjian.us> wrote:
> > On Wed, Sep 4, 2019 at 06:18:31PM +1200, Thomas Munro wrote:
> > > A few years back[1] I experimented with a simple readiness API that
> > > would allow Append to start emitting tuples from whichever Foreign
> > > Scan has data available, when working with FDW-based sharding. I used
> > > that primarily as a way to test Andres's new WaitEventSet stuff and my
> > > kqueue implementation of that, but I didn't pursue it seriously
> > > because I knew we wanted a more ambitious async executor rewrite and
> > > many people had ideas about that, with schedulers capable of jumping
> > > all over the tree etc.
> > >
> > > Anyway, Stephen Frost pinged me off-list to ask about that patch, and
> > > asked why we don't just do this naive thing until we have something
> > > better. It's a very localised feature that works only between Append
> > > and its immediate children. The patch makes it work for postgres_fdw,
> > > but it should work for any FDW that can get its hands on a socket.
> > >
> > > Here's a quick rebase of that old POC patch, along with a demo. Since
> > > 2016, Parallel Append landed, but I didn't have time to think about
> > > how to integrate with that so I did a quick "sledgehammer" rebase that
> > > disables itself if parallelism is in the picture.
> >
> > Yes, sharding has been waiting on parallel FDW scans. Would this work
> > for parallel partition scans if the partitions were FDWs?
>
> Yeah, this works for partitions that are FDWs (as shown), but only for
> Append, not for Parallel Append. So you'd have parallelism in the
> sense that your N remote shard servers are all doing stuff at the same
> time, but it couldn't be in a parallel query on your 'home' server,
> which is probably good for things that push down aggregation and bring
> back just a few tuples from each shard, but bad for anything wanting
> to ship back millions of tuples to chew on locally. Do you think
> that'd be useful enough on its own?

Yes, I think so. There are many data warehouse queries that want to
return only aggregate values, or filter for a small number of rows.
Even OLTP queries might return only a few rows from multiple
partitions. This would allow for a proof-of-concept implementation so
we can see how realistic this approach is.

> The problem is that parallel safe non-partial plans (like postgres_fdw
> scans) are exclusively 'claimed' by one process under Parallel Append,
> so with the patch as posted, if you modify it to allow parallelism
> then it'll probably give correct answers but nothing prevents a single
> process from claiming and starting all the scans and then waiting for
> them to be ready, while the other processes miss out on doing any work
> at all. There's probably some kludgy solution involving not letting
> any one worker start more than X, and some space cadet solution
> involving passing sockets around and teaching libpq to hand over
> connections at certain controlled phases of the protocol (due to lack
> of threads), but nothing like that has jumped out as the right path so
> far.

I am unclear how many queries can do any meaningful work until all
shards have given their full results.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                      Ancient Roman grave inscription +
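[Editor's note] For readers unfamiliar with the mechanism under discussion, the following is a minimal, hypothetical sketch of how an Append-style node could use PostgreSQL's WaitEventSet API to block until any one of its foreign children's sockets becomes readable. It is not the posted patch: the function name `ChooseReadyChild` and the per-child socket array are illustrative assumptions only; the actual proposal wires this kind of wait into Append and postgres_fdw.

```c
/*
 * Hypothetical sketch (not the posted patch): block until one of the
 * foreign-scan children has data on its socket, and return its index.
 * Assumes the FDW has exposed one readable socket per child, e.g. the
 * underlying libpq connection's socket for postgres_fdw.
 */
#include "postgres.h"

#include "miscadmin.h"
#include "storage/latch.h"

static int
ChooseReadyChild(pgsocket *child_sockets, int nchildren)
{
	WaitEventSet *wes;
	WaitEvent	event;
	int			ready_child = -1;

	/* One slot per child socket, plus our latch and postmaster death. */
	wes = CreateWaitEventSet(CurrentMemoryContext, nchildren + 2);
	AddWaitEventToSet(wes, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
	AddWaitEventToSet(wes, WL_EXIT_ON_PM_DEATH, PGINVALID_SOCKET, NULL, NULL);

	for (int i = 0; i < nchildren; i++)
		AddWaitEventToSet(wes, WL_SOCKET_READABLE, child_sockets[i],
						  NULL, (void *) (intptr_t) i);

	/* Sleep until some child can deliver a tuple (or we are interrupted). */
	while (ready_child < 0)
	{
		(void) WaitEventSetWait(wes, -1 /* no timeout */ , &event, 1, 0);

		if (event.events & WL_LATCH_SET)
		{
			ResetLatch(MyLatch);
			CHECK_FOR_INTERRUPTS();
		}
		else if (event.events & WL_SOCKET_READABLE)
			ready_child = (int) (intptr_t) event.user_data;
	}

	FreeWaitEventSet(wes);
	return ready_child;
}
```

The sketch only shows the readiness-wait step that lets Append pull tuples from whichever child is ready first; it is why the approach generalizes to any FDW that "can get its hands on a socket", as the thread puts it.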