One query run twice in parallel results in huge performance decrease - Mailing list pgsql-performance

From Jan Michel
Subject One query run twice in parallel results in huge performance decrease
Date
Msg-id 5299103F.1040307@mueschelsoft.de
Whole thread Raw
Responses Re: One query run twice in parallel results in huge performance decrease
Re: One query run twice in parallel results in huge performance decrease
List pgsql-performance
Dear all,
I have a quite strange problem running an extensive query on geo data
checking for crossing ways. I don't know if this is a postgres or postgis
problem, but I hope you can help. Running one thread is no problem, it
finishes within 10-15 minutes. Run two of those queries in parallel and
they will not finish within 24 hours. It is definitely not a caching or I/O
problem.

First, the environment:
Running on a large server (32 cores, 128 GB RAM, fast RAID disks)
I tested psql 8.1 / 9.1 / 9.3 and postgis 1.5 and 2.1.0 on Debian 6 and
OpenSuse 12.3. All behave similar. The pgsql server settings were
optimized using pgtune, wal logging and autovacuum is off.

I'm working on a set of databases, each 5-10 GB big filled with OSM
geo data. I run many different queries, and I know the server can handle
up to 8 parallel tasks without a decrease in performance compared to a
single thread. Most data is kept in the cache and almost no read access
to the disk needs to be done.
Everything works well, despite one query, that runs on a table with ~ 1M
entries. It searches for ways crossing each other:
http://etherpad.netluchs.de/pgquery
(The definition of the source table is included as well)

Here is the explain analyze of the query:
http://explain.depesz.com/s/fAcV
As you can see, the row estimate is far off, but the runtime of 11 minutes
is acceptable, I think.

When I run a second instance of this query in a unrelated database on the
same server, they take 100% CPU, no iowait and they do not finish even
after more than a day.
An explain done directly before executing the query shows a huge cost
estimate and varying different plans:
http://explain.depesz.com/s/XDR
http://explain.depesz.com/s/SeG

How can two queries have such a strong influence on each other? Especially
when the host server could handle even ten queries without problems?
And most important: What can I do?

Thank you all in advance for your help!
Jan


pgsql-performance by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Postgresql in a Virtual Machine
Next
From: Jeff Janes
Date:
Subject: Re: One query run twice in parallel results in huge performance decrease