Re: High-volume writes - what is the max throughput possible

From Frits Jalvingh
Subject Re: High-volume writes - what is the max throughput possible
Msg-id CAKhTGFWrx06KXEzS4K3Z1aCcEsqB3f62pbNdrpyJUOo7QD7k3w@mail.gmail.com
In response to High-volume writes - what is the max throughput possible  (Geervan Hayatnagarkar <pande.arti@gmail.com>)
Responses Re: High-volume writes - what is the max throughput possible  (Hannu Krosing <hannuk@google.com>)
List pgsql-performance
It completely depends on a lot of factors, of course, so these numbers are meaningless.
It depends at the very least on:
* The hardware (CPU, disk type + disk connection)
* The size of the records read/written
* The presence of indices and constraints

So, adding some other equally meaningless numbers to at least give an idea: we have specialized load processes using Postgres where we reach insert counts of around one million records per second. This is the *compound* insert count of multiple parallel streams that read data from one table and insert it into one or more other tables. So you can definitely go faster, but it depends to a great extent on how you process the data and what you run on.
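Purely as an illustration of the parallel-streams idea (this is not the actual loader described above), a minimal sketch in Python with psycopg2; the DSN, the table names events_raw/events_out, and the modulo partitioning of the source table are all assumptions:

    import psycopg2
    from concurrent.futures import ProcessPoolExecutor

    DSN = "dbname=load user=loader"   # placeholder connection string
    N_STREAMS = 8                     # tune to CPU/disk capacity

    def load_stream(stream_id: int) -> int:
        """One parallel stream: copy a disjoint slice of the source table."""
        with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
            # Modulo partitioning keeps the slices disjoint; %% is a literal %.
            cur.execute(
                "INSERT INTO events_out (id, payload) "
                "SELECT id, payload FROM events_raw WHERE id %% %s = %s",
                (N_STREAMS, stream_id),
            )
            return cur.rowcount

    if __name__ == "__main__":
        with ProcessPoolExecutor(max_workers=N_STREAMS) as pool:
            total = sum(pool.map(load_stream, range(N_STREAMS)))
        print(f"inserted {total} rows over {N_STREAMS} parallel streams")

Each worker opens its own connection, so the streams commit independently and the compound rate is roughly the sum of the per-stream rates.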
If you run on clouds (at least on Azure, which we use) you can be in for other nasty surprises, as they do not really seem to have disks but instead a set of old people writing the data onto paper... On normal (non-ephemeral) disks you will not get close to these numbers.

Things to do are:
* Use the COPY command to do the actual inserts. We wrote a special kind of "insert" that provides the input stream for the COPY command dynamically, as data becomes available.
* Do the reading of the data in a different thread than the writing, and keep a large record buffer between the two processes. That way reading can continue while the writing process writes. (A sketch combining both ideas follows below.)
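For illustration only, a minimal sketch of both points combined, assuming Python with psycopg2, a made-up target table events(id int, payload text), and a placeholder DSN; a reader thread pushes tab-separated lines into a bounded queue (the record buffer), and COPY ... FROM STDIN drains that queue through a small file-like adapter:

    import io
    import queue
    import threading

    import psycopg2

    q: "queue.Queue[str | None]" = queue.Queue(maxsize=10_000)  # the record buffer

    def reader() -> None:
        """Stands in for the real data source; emits COPY text-format lines."""
        for i in range(1_000_000):
            q.put(f"{i}\thello-{i}\n")  # tab-separated: id, payload
        q.put(None)                     # sentinel: no more data

    class QueueReader(io.RawIOBase):
        """File-like adapter so COPY can read straight from the queue."""
        def __init__(self) -> None:
            self._buf = b""
            self._done = False

        def readable(self) -> bool:
            return True

        def read(self, size: int = -1) -> bytes:
            # Block on the queue until the request is satisfied or we hit the sentinel.
            while not self._done and (size < 0 or len(self._buf) < size):
                item = q.get()
                if item is None:
                    self._done = True
                else:
                    self._buf += item.encode()
            if size < 0:
                out, self._buf = self._buf, b""
            else:
                out, self._buf = self._buf[:size], self._buf[size:]
            return out  # b"" signals end-of-stream to copy_expert

    threading.Thread(target=reader, daemon=True).start()
    with psycopg2.connect("dbname=load") as conn, conn.cursor() as cur:  # placeholder DSN
        cur.copy_expert("COPY events (id, payload) FROM STDIN", QueueReader())

The bounded queue doubles as backpressure: if COPY falls behind, the reader blocks on q.put() instead of growing memory without limit; if the reader is the slow side, COPY simply waits in q.get().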

Regards,

Frits


On Fri, Mar 26, 2021 at 1:20 PM Geervan Hayatnagarkar <pande.arti@gmail.com> wrote:
Hi,

We are trying to find the maximum throughput in terms of transactions per second (or simultaneous read+write SQL operations per second) for a use case that does one ACID transaction (consisting of tens of reads and tens of updates/inserts) per incoming element of a high-volume, high-velocity data stream.

Our load test showed us that PostgreSQL version 11/12 could support up to 10,000 to 11,000 such ACID transactions per second, i.e. 55K read SQL operations per second along with a simultaneous 77K write SQL operations per second (= 132K total read+write SQL operations per second).

The underlying hardware limit is much higher. But is this the maximum PostgreSQL can support? If not, what server tuning parameters should we consider for this scale of throughput?

Thanks,
Arti
