Thread: Why is there a tenfold difference between Postgres's alleged query execution time and packet transmission time?

Hey all,

I've been pulling my hair out over this for days now, as I'm trying to build a low-latency application. Databases should be fast, but I cannot work out why so much latency is added between the actual database process and the application code. For simple queries that should take less than a millisecond, this mystery latency is by far the biggest performance hit.

For example, I have a very simple table on a local Postgres database, which takes 324us to query. This is absolutely fine.

postgres=# EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) select count("ID") from example t;
-[ RECORD 1 ]------------------------------------------------------
QUERY PLAN | Aggregate (actual rows=1 loops=1)
-[ RECORD 2 ]------------------------------------------------------
QUERY PLAN |   ->  Seq Scan on example t (actual rows=4119 loops=1)
-[ RECORD 3 ]------------------------------------------------------
QUERY PLAN | Planning Time: 0.051 ms
-[ RECORD 4 ]------------------------------------------------------
QUERY PLAN | Execution Time: 0.324 ms

But when I connect to the database from Python, getting the actual data takes over 2500us! The loopback interface is not the problem here; I sanity-checked that (it adds 100us at a stretch).

In [1]: %time cursor.execute("select count(\"ID\") from example;");r = cursor.fetchmany()
CPU times: user 316 µs, sys: 1.12 ms, total: 1.44 ms
Wall time: 2.73 ms
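
(For reference, raw loopback round-trip time can be sanity-checked with something as simple as the rough sketch below: a TCP echo over 127.0.0.1 timed with perf_counter. This is illustrative rather than exactly what I ran.)

# Rough loopback round-trip check: an echo server and client on 127.0.0.1,
# timing one small send/recv cycle after a warm-up round.
import socket
import threading
import time

def echo_server(listener):
    conn, _ = listener.accept()
    with conn:
        while data := conn.recv(64):
            conn.sendall(data)

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
threading.Thread(target=echo_server, args=(srv,), daemon=True).start()

cli = socket.create_connection(srv.getsockname())
cli.sendall(b"x")            # warm up the connection first
cli.recv(64)

t0 = time.perf_counter()
cli.sendall(b"ping")
cli.recv(64)
print(f"loopback round trip: {(time.perf_counter() - t0) * 1e6:.0f} us")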

Investigating further, I opened up Wireshark to see how long the packets themselves took.

[Wireshark screenshot]

From this I can tell that Postgres took 2594us to send the data. This kind of overhead for a loopback IPC call is way beyond what I would expect. Why is this, and how can I optimise it away?

This is a problem for the software ecosystem, because people start thinking databases are too slow for simple lookups, so they end up caching, in Redis, queries that take 300us to execute but 3ms to fetch back to the client. It adds up to an awful lot of latency.


As an aside, I've been really miffed by how bad database IPC performance has been in general.  Oracle has a problem where every additional ~1500 bytes adds roughly 800us to the time it takes to receive the result (on a gigabit network).


I'd really love to know what's happening here.


Rollo


Rollo Konig-Brock <rollokb@gmail.com> writes:
> I've been pulling my hair out over this for days now, as I'm trying to
> build a low-latency application. Databases should be fast, but I cannot
> work out why so much latency is added between the actual database process
> and the application code. For simple queries that should take less than a
> millisecond, this mystery latency is by far the biggest performance hit.

Well, for sub-millisecond queries, I'm afraid that EXPLAIN's numbers
omit a lot of the overhead that you have to think about for such short
queries.  For instance:

* Query parsing (both the grammar and parse analysis).  You could get
a handle on how much this is relative to what EXPLAIN knows about by
enabling log_parser_stats, log_planner_stats, and log_executor_stats.
Depending on workload, you *might* be able to ameliorate these costs
by using prepared queries, although that cure can easily be worse
than the disease.
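
For instance, an untested sketch of both ideas from the client side, assuming psycopg2 and your "example" table; note that the log_*_stats output goes to the server log, not back to the client:

import psycopg2

# Connection parameters assumed; adjust to taste.
conn = psycopg2.connect("dbname=postgres")
cur = conn.cursor()

# Per-stage timing in the server log (superuser, or the appropriate
# SET privilege, required).
for guc in ("log_parser_stats", "log_planner_stats", "log_executor_stats"):
    cur.execute(f"SET {guc} = on")

cur.execute('select count("ID") from example')        # parsed and planned every time
cur.execute('PREPARE cnt AS select count("ID") from example')
cur.execute('EXECUTE cnt')                            # parse/plan work paid once per session
print(cur.fetchone())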

* I/O conversions, notably both formatting of output data and charset
encoding conversions.  You can possibly ameliorate these by using
binary output and making sure that the client and server use the same
encoding.
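
For example, an untested sketch using psycopg 3 (which can request binary-format results, unlike psycopg2), plus a check that the two encodings match:

import psycopg

# Connection parameters assumed.
with psycopg.connect("dbname=postgres") as conn:
    with conn.cursor(binary=True) as cur:      # ask for binary result format
        cur.execute('select count("ID") from example')
        print(cur.fetchone())
    with conn.cursor() as cur:
        cur.execute("SHOW server_encoding")
        print(cur.fetchone()[0])
        cur.execute("SHOW client_encoding")
        print(cur.fetchone()[0])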

* SSL encryption.  This is probably not enabled on a local loopback
connection, but it doesn't hurt to check.
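
You can both ask the server and force the issue from the client; an untested sketch (sslmode is libpq's connection parameter, psycopg2 assumed):

import psycopg2

# Force SSL off for the loopback connection, then ask what the backend thinks.
conn = psycopg2.connect("dbname=postgres sslmode=disable")
cur = conn.cursor()
cur.execute("select ssl, version, cipher from pg_stat_ssl"
            " where pid = pg_backend_pid()")
print(cur.fetchone())    # e.g. (False, None, None) when SSL is not in use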

* Backend catalog cache filling.  It doesn't pay to make a connection
for just one or a few queries, because a newly started backend
process won't really be up to speed until it's populated its caches
with catalog data that's relevant to your queries.  I think most
(not all) of this cost is incurred during parse analysis, which
would help to hide it from EXPLAIN-based investigation.
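
An untested sketch of what that looks like from the client: on a freshly opened connection, time the same query a few times and compare the first run to the rest (psycopg2 assumed).

import time
import psycopg2

conn = psycopg2.connect("dbname=postgres")
cur = conn.cursor()
for i in range(3):
    t0 = time.perf_counter()
    cur.execute('select count("ID") from example')
    cur.fetchone()
    # Run 0 pays for catalog cache population; later runs should be faster.
    print(f"run {i}: {(time.perf_counter() - t0) * 1e6:.0f} us")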

            regards, tom lane