Home > mailing lists

Planner performance extremely affected by an hanging transaction (20-30 times)? - Mailing list pgsql-performance

From	Bartłomiej Romański
Subject	Planner performance extremely affected by an hanging transaction (20-30 times)?
Date	September 20, 2013 03:49:45
Msg-id	CAC6=Lj7fYUWzcKLU6X7-8D1q+ijvBrGmwS2dw4mixhH8F8KSAw@mail.gmail.com Whole thread Raw
Responses	Re: Planner performance extremely affected by an hanging transaction (20-30 times)? Re: Planner performance extremely affected by an hanging transaction (20-30 times)?
List	pgsql-performance

Tree view

Hi all

We're experiencing a very strange performance issue. Our setup is a bit more complicated, but we've managed to isolate and replicate the core problem. Here's what we observe:

We took a strong machine (128 GB RAM, 8-core CPU, SSD drives...) and installed a fresh copy of PostgreSQL 9.2 (Ubuntu Server 12.04 LTS, default configuration).

Then, we created a test database with the following schema:

CREATE TABLE offers

(

id bigserial NOT NULL,

name character varying NOT NULL,

category_id bigint NOT NULL,

CONSTRAINT offers_pkey PRIMARY KEY (id)

);

CREATE TABLE categories

(

id bigserial NOT NULL,

name character varying NOT NULL,

CONSTRAINT categories_pkey PRIMARY KEY (id)

);

and populated it with in the following way:

insert into categories (name) select 'category_' || x from generate_series(1,1000) as x;

insert into offers (name, category_id) select 'offer_' || x, floor(random() * 1000) + 1 from generate_series(1,1000*1000) as x;

Finally, we created a python script to make simple queries in a loop:

while True:

id = random.randrange(1, 1000 * 1000)

db.execute('select offers.id, offers.name, categories.id, categories.name from offers left join categories on categories.id = offers.category_id where offers.id = %s', (id,))

print db.fetchall()

We start 20 instances simultaneously and measure performance:

parallel -j 20 ./test.py -- $(seq 1 20) | pv -i1 -l > /dev/null

Normally we observe about 30k QPS what's a satisfying result (without any tuning at all).

The problem occurs when we open a second console, start psql and type:

pgtest=> begin; insert into categories (name) select 'category_' || x from generate_series(1,1000) as x;

We start a transaction and insert 1k records to the 'categories' table. After that performance of the benchmark described above immediately drops to about 1-2k QPS. That's 20-30 times! After closing the transaction (committing or aborting - doesn't matter) it immediately jumps back to 30k QPS.

Restarting the running script and other simple tricks do not change anything. The hanging, open transaction is causing a huge slowdown. What's more when doing similar insert (even much larger) to the 'offers' table we do not observe this effect.

We analyzed the problem a bit deeper and find out that the actual query execution times are not affected that much. They are constantly close to 0.5 ms. This can be observed in a server log (after enabling appropriate option) and this can be found in 'explain analyze...' result. Also the query plan returned do not change and looks optimal (pkey scan for 'offers' + pkey scan for 'categories').

After a few random thought we've finally turned the 'log_planner_stats' option and found out that the planner executions times are highly affected by the hanging transaction. Here's the typical output in the initial situation:

2013-09-17 21:54:59 UTC LOG: PLANNER STATISTICS

2013-09-17 21:54:59 UTC DETAIL: ! system usage stats:

! 0.000137 elapsed 0.000000 user 0.000000 system sec

! [2.169670 user 0.383941 sys total]

! 0/0 [0/11520] filesystem blocks in/out

! 0/0 [0/7408] page faults/reclaims, 0 [0] swaps

! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent

! 0/0 [1362/7648] voluntary/involuntary context switches

And here's a typical output with a hanging transaction:

2013-09-17 21:56:12 UTC LOG: PLANNER STATISTICS

2013-09-17 21:56:12 UTC DETAIL: ! system usage stats:

! 0.027251 elapsed 0.008999 user 0.001000 system sec

! [32.722025 user 3.550460 sys total]

! 0/0 [0/115128] filesystem blocks in/out

! 0/0 [0/7482] page faults/reclaims, 0 [0] swaps

! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent

! 1/6 [12817/80202] voluntary/involuntary context switches

As you can see it takes over 100 times more time when the extra transaction is open!

Any ideas why's that? It extremely affects total query time.

I know that using prepared statements to solves the problem completely, so we can deal with it, but we've already spend a lot of time to go that far and I'm personally a bit curious what's the fundamental reason for such a behavior.

I'll be very thankful for any explanation what's going on here!

Thanks,

pgsql-performance by date:

From: Josh Berkus
Date: 19 September 2013, 23:47:20
Subject: Why is n_distinct always -1 for range types?

From: Julian
Date: 20 September 2013, 04:43:02
Subject: Re: Planner performance extremely affected by an hanging transaction (20-30 times)?

Planner performance extremely affected by an hanging transaction (20-30 times)? - Mailing list pgsql-performance

Previous

Next