On Thu, Aug 08, 2024 at 06:18:38PM -0400, Corey Huinker wrote:
> I think the underlying mechanism is basically solid, but I have one
> question: isn't this the ideal case for using libpq pipelining? That would
> allow subsequent tasks to launch while the main loop slowly gets around to
> clearing off completed tasks on some other connection.

I'll admit I hadn't really considered pipelining, but I'm tempted to say
that it's probably not worth the complexity. Not only do most of the tasks
have only one step, but even tasks like the data types check are unlikely
to require more than a few queries for upgrades from supported versions.
Furthermore, most of the callbacks should do almost nothing for a given
upgrade, and since pg_upgrade runs on the server, client/server round-trip
time should be pretty low.

Perhaps pipelining would make more sense if we consolidated the tasks a bit
better, but when I last looked into that, I didn't see a ton of great
opportunities that would help anything except for upgrades from really old
versions. Even then, I'm not sure if pipelining is worth it.
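
To put rough numbers on the round-trip argument, here is a back-of-the-envelope
sketch. It is not pg_upgrade code: the function names and the cost model are
made up for illustration, and it assumes that server-side execution time is
ignored, that a non-pipelined task pays one round trip per query, and that a
pipelined task pays roughly one round trip in total.

```c
/*
 * Hypothetical cost model (an assumption for illustration, not pg_upgrade
 * code): time one connection spends waiting on the network for a task
 * that issues nqueries queries.
 */
static double
serial_wait_ms(int nqueries, double rtt_ms)
{
	/* Without pipelining, every query waits for its own round trip. */
	return nqueries * rtt_ms;
}

static double
pipelined_wait_ms(int nqueries, double rtt_ms)
{
	/*
	 * With pipelining, the queries are sent back to back, so roughly one
	 * round trip is paid for the whole batch.
	 */
	return (nqueries > 0) ? rtt_ms : 0.0;
}

/* Network wait that pipelining could save for one task under this model. */
static double
pipelining_savings_ms(int nqueries, double rtt_ms)
{
	return serial_wait_ms(nqueries, rtt_ms) -
		pipelined_wait_ms(nqueries, rtt_ms);
}
```

Under this model a single-step task saves nothing, and a task with a few
queries saves only a few round trips, which is small when the round-trip time
itself is low.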
--
nathan