Re: [PATCH] pg_dump: lock tables in batches - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCH] pg_dump: lock tables in batches
Date
Msg-id 20221207174439.ii2stmiv45aghbnw@awork3.anarazel.de
In response to Re: [PATCH] pg_dump: lock tables in batches  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2022-12-07 12:28:03 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-12-07 10:44:33 -0500, Tom Lane wrote:
> >> I have a strong sense of deja vu here.  I'm pretty sure I experimented
> >> with this idea last year and gave up on it.  I don't recall exactly
> >> why, but either it didn't show any meaningful performance improvement
> >> for me or there was some actual downside (that I'm not remembering
> >> right now).
> 
> > IIRC the case we were looking at around 989596152 were CPU bound workloads,
> > rather than latency bound workloads. It'd not be surprising to have cases
> > where batching LOCKs helps latency, but not CPU bound.
> 
> Yeah, perhaps.  Anyway my main point is that I don't want to just assume
> this is a win; I want to see some actual performance tests.

FWIW, one can simulate network latency with 'netem' on Linux. Works even for
'lo'.

ping -c 3 -n localhost

64 bytes from ::1: icmp_seq=1 ttl=64 time=0.035 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.049 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.043 ms

tc qdisc add dev lo root netem delay 10ms

64 bytes from ::1: icmp_seq=1 ttl=64 time=20.1 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=20.2 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=20.2 ms

tc qdisc delete dev lo root netem

64 bytes from ::1: icmp_seq=1 ttl=64 time=0.036 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.047 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.050 ms
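To put rough numbers on the latency-bound case (illustrative arithmetic, not a
measurement): if each LOCK TABLE costs one network round trip, the waiting time
grows linearly with the table count, and batching divides it by the batch size:

```python
# Back-of-envelope sketch: seconds spent waiting on LOCK TABLE round
# trips alone, assuming one round trip per batch of LOCK statements.
# The table counts and batch size below are illustrative.
def lock_round_trip_time(n_tables, rtt_ms, batch_size=1):
    """Time (seconds) spent on network round trips to lock n_tables,
    issuing batch_size LOCK statements per round trip."""
    round_trips = -(-n_tables // batch_size)  # ceiling division
    return round_trips * rtt_ms / 1000.0

# 10,000 tables at the ~20 ms RTT simulated with netem above:
print(lock_round_trip_time(10_000, 20))       # unbatched: 200.0 s
print(lock_round_trip_time(10_000, 20, 100))  # batches of 100: 2.0 s
```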


> > I wonder if "manual" batching is the best answer. Alexander, have you
> > considered using libpq level pipelining?
> 
> I'd be a bit nervous about how well that works with older servers.

I don't think there should be any problem; e.g., pgjdbc has been using
pipelining for ages.

Not sure if it's the right answer, just to be clear. I suspect that eventually
we're going to need to have a special "acquire pg_dump locks" function that is
cheaper than retail lock acquisition and perhaps deals more gracefully with
exceeding max_locks_per_transaction. Which would presumably not be pipelined.
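For concreteness, the libpq-level pipelining mentioned above could look roughly
like the sketch below (this is not pg_dump's actual code; the client pipeline
API is available since PostgreSQL 14, table names and connection parameters are
illustrative, identifiers are assumed pre-quoted, and error handling is
abbreviated):

```c
/* Sketch: queue all LOCK TABLE statements, then flush once, so the
 * lock acquisition costs roughly one round trip instead of one per
 * table. Requires a reachable server; illustrative only. */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    const char *tables[] = {"public.t1", "public.t2", "public.t3"};
    int         ntables = 3;

    PGconn *conn = PQconnectdb("");     /* parameters from environment */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connect: %s", PQerrorMessage(conn));
        return 1;
    }

    /* LOCK TABLE requires a transaction block. */
    PQclear(PQexec(conn, "BEGIN"));

    PQenterPipelineMode(conn);
    for (int i = 0; i < ntables; i++)
    {
        char sql[512];

        snprintf(sql, sizeof(sql),
                 "LOCK TABLE %s IN ACCESS SHARE MODE", tables[i]);
        PQsendQueryParams(conn, sql, 0, NULL, NULL, NULL, NULL, 0);
    }
    PQpipelineSync(conn);               /* single flush for all queued LOCKs */

    /* One result (plus a NULL terminator) per queued LOCK statement. */
    for (int i = 0; i < ntables; i++)
    {
        PGresult *res = PQgetResult(conn);

        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "LOCK %s: %s", tables[i],
                    PQresultErrorMessage(res));
        PQclear(res);
        PQclear(PQgetResult(conn));     /* NULL between queries */
    }
    PQclear(PQgetResult(conn));         /* PGRES_PIPELINE_SYNC result */

    PQexitPipelineMode(conn);
    /* ... perform the dump, then COMMIT and clean up ... */
    PQfinish(conn);
    return 0;
}
```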

Greetings,

Andres Freund


