Re: Built-in connection pooling - Mailing list pgsql-hackers
From | Konstantin Knizhnik |
---|---|
Subject | Re: Built-in connection pooling |
Msg-id | a58585ab-413f-cacd-3632-176bd27b7198@postgrespro.ru |
In response to | Re: Built-in connection pooling (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>) |
Responses | Re: Built-in connection pooling |
List | pgsql-hackers |
On 01.02.2018 15:21, Vladimir Sitnikov wrote:
Konstantin> I have obtained more results with YCSB benchmark and built-in connection pooling

Could you provide more information on the benchmark setup you have used? For instance: benchmark library versions, PostgreSQL client version, additional/default benchmark parameters.
I am using the latest Postgres sources with the connection pooling patch applied.
I did not build YCSB myself; I used an existing installation.
To launch the tests I used the following YCSB command lines.
To load data:
YCSB_MAXRUNTIME=60 YCSB_OPS=1000000000 YCSB_DBS="pgjsonb-local" YCSB_CFG="bt" YCSB_CLIENTS="250" YCSB_WORKLOADS="load_a" ./ycsb.sh
To run test:
YCSB_MAXRUNTIME=60 YCSB_OPS=1000000000 YCSB_DBS="pgjsonb-local" YCSB_CFG="bt" YCSB_CLIENTS="250 500 750 1000" YCSB_WORKLOADS="run_a" ./ycsb.sh
$ cat config/pgjsonb-local.dat
db.driver=org.postgresql.Driver
db.url=jdbc:postgresql://localhost:5432/ycsb
db.user=ycsb
db.passwd=ycsb
db.batchsize=100
jdbc.batchupdateapi=true
table=usertable
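For reference, YCSB's workload A is a 50/50 mix of reads and updates over keys drawn from a Zipfian distribution. A minimal sketch of the per-operation logic is below; it is an illustration only, not the YCSB generator itself, and the column names `ycsb_key`/`ycsb_value` are assumptions (the actual JDBC/jsonb binding schema may differ):

```python
import random

def zipf_key(n_keys, rng, s=0.99):
    """Pick a key index with an approximate Zipfian bias toward small indices.

    YCSB uses a more elaborate (scrambled) Zipfian generator; this simple
    inverse-CDF walk is enough to show the skew.
    """
    weights = [1.0 / (i + 1) ** s for i in range(n_keys)]
    r = rng.random() * sum(weights)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if acc >= r:
            return i
    return n_keys - 1

def next_operation(n_keys, rng):
    """Return (sql, params) for one workload-A operation: 50% read, 50% update."""
    key = "user%d" % zipf_key(n_keys, rng)
    if rng.random() < 0.5:
        return ("SELECT ycsb_value FROM usertable WHERE ycsb_key = %s", (key,))
    return ("UPDATE usertable SET ycsb_value = %s WHERE ycsb_key = %s", ("{}", key))
```

Because the key distribution is heavily skewed, concurrent clients frequently pick the same hot keys, which is what produces the conflicting updates discussed below.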
Sorry, I am not sure that I completely understand your question.

Konstantin> Postgres shows significant slow down with increasing number of connections in case of conflicting updates.
Konstantin> Built-in connection pooling can somehow eliminate this problem

Can you please clarify how connection pooling eliminates the slowdown? Is the case as follows?
1) The application updates multiple rows in a single transaction
2) There are multiple concurrent threads
3) The threads update the same rows at the same time

If that is the case, then the actual workload is different each time you vary the connection pool size. For instance, if you use 1 thread, then the writes become uncontended. Of course, you might just use it as a "black box" workload; however, I wonder if that kind of workload ever appears in real-life applications. I would expect applications to update the same row multiple times, but as subsequent updates, not concurrent ones. On the other hand, as you vary the pool size, the workload varies as well (the resulting database contents are different), so it looks like comparing apples to oranges.

Vladimir
YCSB (Yahoo! Cloud Serving Benchmark) is essentially a multi-client benchmark framework which assumes a large number of concurrent requests to the database.
The requests themselves are very simple (the benchmark emulates a key-value storage).
In my tests I performed measurements for 250, 500, 750 and 1000 connections.

One of the main problems of Postgres is a significant degradation of performance in case of concurrent write access by multiple transactions to the same rows.
This is why performance of pgbench and the YCSB benchmark degrades significantly (more than linearly) with an increasing number of client connections, especially in case of a Zipf distribution
(which significantly increases the probability of conflict).
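The contention effect can be illustrated with a small simulation: the probability that at least two of N concurrent clients pick the same row grows quickly with N, and a Zipfian skew makes it much worse than a uniform distribution. This is a rough sketch under simplifying assumptions (independent single-key picks per client, a plain `1/rank^s` Zipf weighting), not the YCSB generator:

```python
import random

def collision_rate(n_clients, n_keys, trials=500, zipf_s=0.99, seed=1):
    """Fraction of trials in which at least two of n_clients choose the same key.

    zipf_s=0.0 degenerates to a uniform key distribution, which makes it easy
    to compare skewed vs. uniform access patterns.
    """
    rng = random.Random(seed)
    weights = [1.0 / (i + 1) ** zipf_s for i in range(n_keys)]
    collisions = 0
    for _ in range(trials):
        picks = rng.choices(range(n_keys), weights=weights, k=n_clients)
        if len(set(picks)) < n_clients:   # duplicate key => write conflict
            collisions += 1
    return collisions / trials
```

Comparing, say, `collision_rate(2, 1000)` against `collision_rate(20, 1000)`, or a Zipfian run against `zipf_s=0.0`, shows how fast conflict probability climbs with client count and skew.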
Connection pooling makes it possible to fix the number of backends and serve almost any number of connections with this fixed set of backends.
So the results are almost the same for 250, 500, 750 and 1000 connections.
The problem is choosing the optimal number of backends.
For read-only pgbench the best results are achieved with 300 backends; for YCSB with 5% of updates, with 70 backends; for YCSB with 50% of updates, with 30 backends.
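The idea of serving many client connections with a fixed number of backends can be modeled with a semaphore that caps concurrency. This is a simplified in-process sketch of the multiplexing principle, not the actual patch (which schedules sessions onto backends inside the server):

```python
import threading

class BackendPool:
    """Model: many client sessions multiplexed over a fixed set of backends."""

    def __init__(self, n_backends):
        self._slots = threading.Semaphore(n_backends)  # one slot per backend
        self._lock = threading.Lock()
        self.active = 0        # sessions currently executing
        self.max_active = 0    # high-water mark of concurrent execution

    def execute(self, work):
        with self._slots:                  # block until a backend is free
            with self._lock:
                self.active += 1
                self.max_active = max(self.max_active, self.active)
            try:
                return work()              # run the query on the backend
            finally:
                with self._lock:
                    self.active -= 1       # release the backend
```

However many client threads call `execute`, at most `n_backends` of them run concurrently, so lock contention inside the "database" stays bounded regardless of connection count; this is the effect that keeps the 250-1000 connection results nearly identical.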
So something definitely needs to be changed in the Postgres locking mechanism.
Connection pooling makes it possible to minimize contention on resources and the performance degradation caused by such contention.
But unfortunately it is not a silver bullet fixing all Postgres scalability problems.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company