Help with a good mental model for estimating PostgreSQL throughput

From: David Ventimiglia
Date: Mon, 2023-10-30 08:05 -0700
Hello!

Can someone help me develop a good mental model for estimating PostgreSQL throughput?  Here's what I mean.  Suppose I have:
  • 1000 connections
  • typical query execution time of 1ms
  • but additional network latency of 100ms
What, if anything, would be a reasonable estimate of the number of operations that can be performed within 1 second?  My initial guess would be ~10000, but perhaps I'm overlooking something.  I expect a more reliable figure would come from testing, but I'm looking for an a priori back-of-the-envelope estimate.  Thanks!
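
A quick sketch of the arithmetic behind that ~10000 guess, assuming all 1000 connections
stay busy and each operation holds its connection for the full query time plus the
network latency:

    # Illustrative estimate only: assumes every connection is saturated and each
    # operation costs query time + network latency, end to end.
    query_ms = 1          # typical query execution time
    latency_ms = 100      # additional network latency per operation
    connections = 1000

    per_op_ms = query_ms + latency_ms        # 101 ms per operation
    ops_per_conn = 1000.0 / per_op_ms        # ~9.9 operations/second per connection
    total_ops = ops_per_conn * connections   # ~9900 operations/second overall
    print(round(total_ops))                  # -> 9901, i.e. roughly 10000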

Best,
David

Re: Help with a good mental model for estimating PostgreSQL throughput

From: Laurenz Albe
Date: Mon, 2023-10-30 08:46 -0700
On Mon, 2023-10-30 at 08:05 -0700, David Ventimiglia wrote:
> Can someone help me develop a good mental model for estimating PostgreSQL throughput?
> Here's what I mean.  Suppose I have:
>  * 1000 connections
>  * typical query execution time of 1ms
>  * but additional network latency of 100ms
> What, if anything, would be a reasonable estimate of the number of operations that can
> be performed within 1 second?  My initial guess would be ~10000, but perhaps I'm
> overlooking something.  I expect a more reliable figure would come from testing, but
> I'm looking for an a priori back-of-the-envelope estimate.  Thanks!

It depends on the number of cores, if the workload is CPU bound.
If the workload is disk bound, look for the number of I/O requests a typical query
needs, and how many of them you can perform per second.
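
A rough sketch of that disk-bound ceiling, with made-up numbers for the storage and the
per-query I/O (both are assumptions, not figures from this thread):

    # Illustrative only: throughput ceiling imposed by storage when the workload
    # is disk bound and I/O requests are the scarce resource.
    device_iops = 20000         # assumed sustainable random IOPS of the storage
    ios_per_query = 5           # assumed I/O requests needed by a typical query
    max_queries_per_sec = device_iops / ios_per_query
    print(max_queries_per_sec)  # -> 4000.0 queries/second as an upper bound from disk alone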

The network latency might well be a killer.

Use pgBouncer with transaction mode pooling.
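
A minimal pgbouncer.ini sketch along those lines; the database name, addresses, paths and
pool size below are placeholders, and the setting that matters here is pool_mode = transaction:

    [databases]
    ; placeholder connection string; point this at the real server
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    ; return a server connection to the pool as soon as the transaction ends,
    ; so many client connections can share a small number of server backends
    pool_mode = transaction
    default_pool_size = 20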

Yours,
Laurenz Albe



Re: Help with a good mental model for estimating PostgreSQL throughput

From: David Ventimiglia
Date: Mon, 2023-10-30 08:59 -0700
Thanks!  Let's say there are 10 cores, the workload is not CPU bound, and there is a connection pooler like pgBouncer in place.  Would the number of operations more likely be:
  • 1000 ms / total ms per operation * number of cores = 1000 ms / 101 ms * 10 = ~100
  • 1000 ms / total ms per operation * number of connections = 1000 ms / 101 ms * 1000 = ~10000
  • something else
  • impossible to determine without more information
Best,
David

On Mon, Oct 30, 2023 at 8:46 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Mon, 2023-10-30 at 08:05 -0700, David Ventimiglia wrote:
> Can someone help me develop a good mental model for estimating PostgreSQL throughput?
> Here's what I mean.  Suppose I have:
>  * 1000 connections
>  * typical query execution time of 1ms
>  * but additional network latency of 100ms
> What, if anything, would be a reasonable estimate of the number of operations that can
> be performed within 1 second?  My initial guess would be ~10000, but perhaps I'm
> overlooking something.  I expect a more reliable figure would come from testing, but
> I'm looking for an a priori back-of-the-envelope estimate.  Thanks!

It depends on the number of cores, if the workload is CPU bound.
If the workload is disk bound, look for the number of I/O requests a typical query
needs, and how many of them you can perform per second.

The network latency might well be a killer.

Use pgBouncer with transaction mode pooling.

Yours,
Laurenz Albe

Re: Help with a good mental model for estimating PostgreSQL throughput

From: Laurenz Albe
Date:
On Mon, 2023-10-30 at 08:59 -0700, David Ventimiglia wrote:
> On Mon, Oct 30, 2023 at 8:46 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Mon, 2023-10-30 at 08:05 -0700, David Ventimiglia wrote:
> > > Can someone help me develop a good mental model for estimating PostgreSQL throughput?
> > > Here's what I mean.  Suppose I have:
> > >  * 1000 connections
> > >  * typical query execution time of 1ms
> > >  * but additional network latency of 100ms
> > > What, if anything, would be a reasonable estimate of the number of operations that can
> > > be performed within 1 second?  My initial guess would be ~10000, but perhaps I'm
> > > overlooking something.  I expect a more reliable figure would come from testing, but
> > > I'm looking for an a priori back-of-the-envelope estimate.  Thanks!
> >
> > It depends on the number of cores, if the workload is CPU bound.
> > If the workload is disk bound, look for the number of I/O requests a typical query
> > needs, and how many of them you can perform per second.
> >
> > The network latency might well be a killer.
> >
> > Use pgBouncer with transaction mode pooling.
>
> Thanks!  Let's say there are 10 cores, the workload is not CPU bound, and there is a
> connection pooler like pgBouncer in place.  Would the number of operations more likely be:
>
>  * 1000 ms / total ms per operation * number of cores = 1000 ms / 101 ms * 10 = ~100
>  * 1000 ms / total ms per operation * number of connections = 1000 ms / 101 ms * 1000 = ~10000
>  * something else
>  * impossible to determine without more information

If the workload is not CPU bound, it is probably disk bound, and you have to look at
the number of I/O requests.

If you look at the CPU, the second calculation should be more to the point.
However, if one customer request results in 10 sequential database requests, that
request will already take about 2 seconds due to the network latency, even though it
causes next to no load on the database.
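
To make the arithmetic behind that explicit (a sketch that assumes the 10 database
requests happen strictly sequentially and that the 100 ms is one-way latency, i.e.
roughly 200 ms per round trip):

    # Illustrative only: end-to-end latency of one customer request vs. the
    # actual work done on the database server.
    rtt_ms = 200          # assumed network round trip (100 ms each way)
    query_ms = 1          # typical query execution time
    db_requests = 10      # sequential database requests per customer request

    customer_request_ms = db_requests * (rtt_ms + query_ms)
    server_busy_ms = db_requests * query_ms
    print(customer_request_ms)   # -> 2010 ms, i.e. about 2 seconds end to end
    print(server_busy_ms)        # -> 10 ms of actual query execution on the server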

Yours,
Laurenz Albe