Slow ext'd query via client native implementation vs. libpq & simple psql - Mailing list pgsql-performance

From Richard Michael
Subject Slow ext'd query via client native implementation vs. libpq & simple psql
Date
Msg-id CABR0jERv5L7U0eYyNxv3ZES9HbJj9r2KPK3_=oPMBXTbTC+oXw@mail.gmail.com
Whole thread Raw
Responses Re: Slow ext'd query via client native implementation vs. libpq &simple psql  (Justin Pryzby <pryzby@telsasoft.com>)
Re: Slow ext'd query via client native implementation vs. libpq & simple psql  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Hello,

On PG 12.2, I am analyzing a performance problem when using a client
(Elixir/postgrex) querying via the extended query protocol.  I am comparing with
psql and a C program.  Logs for all three follow this short explanation.

The query is trivial: `SELECT [cols] FROM t WHERE id = X` on a 65K row table.

The Elixir client executes this as an extended query in >500-700ms, very slow.
If relevant, the client does not use libpq, it is a native implementation.

A simple query via psql `PREPARE q AS ... ; EXECUTE q;` executes in ~130ms.
(Aside, please let me know if psql can execute extended queries.)

To compare another extended query protocol client, I wrote a tiny C program
using libpq PQprepare()/PQexecPrepared() executes in ~100ms.

I have tried to make the three tests as similar as possible; all are via
localhost and use named statements.

All use an identical query plan. There is a btree index on the WHERE col, but
the table is small enough it is not used.

The above durations are consistent across server restart, reboot, and repetition
(i.e.  still >500ms if run multiple times), so it seems independent of
filesystem caching, buffers, etc.

Obviously the client's query execution is somehow different, but I do not know
what/why.

I have enabled auto_explain min_duration (0), analyze, buffers, verbose and
settings.  What more can I log or do to postgres to understand why the client is
behaving poorly?  Would wireshark on client messages reveal anything postgres
can't log?

Other suggestions much appreciated as well.

(I happen to be on OSX, but I can test elsewhere if necessary.)

Regards,
richard


Elixir/postgrex extended query: (always >500ms)
-------------------------------
2020-03-11 13:46:20.090 EDT [3401] LOG:  connection received: host=127.0.0.1 port=50128
2020-03-11 13:46:20.096 EDT [3401] LOG:  connection authorized: user=testuser database=biodata
2020-03-11 13:46:20.141 EDT [3401] LOG:  duration: 1.138 ms  parse ecto_98: SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:46:20.144 EDT [3401] LOG:  duration: 2.292 ms  bind ecto_98: SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:46:20.658 EDT [3401] LOG:  duration: 513.791 ms  execute ecto_98: SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:46:20.658 EDT [3401] LOG:  duration: 513.792 ms  plan:
Query Text: SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1)
Seq Scan on public.genes g0  (cost=0.00..6591.62 rows=60632 width=224) (actual time=0.523..33.229 rows=60623 loops=1)
 Output: id, genome_id, seqid, source, feature, start, "end", score, strand, phase, attribute
 Filter: (g0.genome_id = 1)
 Rows Removed by Filter: 3907
 Buffers: shared read=5785
2020-03-11 13:46:20.887 EDT [3401] LOG:  disconnection: session time: 0:00:00.796 user=testuser database=biodata host=127.0.0.1 port=50128



psql simple PREPARE/EXECUTE:
----------------------------
2020-03-11 13:46:40.021 EDT [3438] LOG:  connection received: host=::1 port=50129
2020-03-11 13:46:40.044 EDT [3438] LOG:  connection authorized: user=testuser database=biodata application_name=psql
2020-03-11 13:46:40.106 EDT [3438] LOG:  duration: 58.071 ms  plan:
Query Text: PREPARE q AS SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1); EXECUTE q;
Seq Scan on public.genes g0  (cost=0.00..6591.62 rows=60632 width=224) (actual time=0.060..25.542 rows=60623 loops=1)
 Output: id, genome_id, seqid, source, feature, start, "end", score, strand, phase, attribute
 Filter: (g0.genome_id = 1)
 Rows Removed by Filter: 3907
 Buffers: shared hit=42 read=5743
2020-03-11 13:46:40.182 EDT [3438] LOG:  duration: 136.173 ms  statement: PREPARE q AS SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1); EXECUTE q;
2020-03-11 13:46:40.182 EDT [3438] DETAIL:  prepare: PREPARE q AS SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM "genes" AS g0 WHERE (g0."genome_id" = 1); EXECUTE q;
2020-03-11 13:46:41.140 EDT [3438] LOG:  disconnection: session time: 0:00:01.119 user=testuser database=biodata host=::1 port=50129



C libpq extended query:
-----------------------
2020-03-11 13:50:00.220 EDT [4299] LOG:  connection received: host=::1 port=50137
2020-03-11 13:50:00.232 EDT [4299] LOG:  connection authorized: user=testuser database=biodata
2020-03-11 13:50:00.234 EDT [4299] LOG:  duration: 0.437 ms  parse foo: SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM genes AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:50:00.235 EDT [4299] LOG:  duration: 0.489 ms  bind foo: SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM genes AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:50:00.342 EDT [4299] LOG:  duration: 106.874 ms  execute foo: SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM genes AS g0 WHERE (g0."genome_id" = 1)
2020-03-11 13:50:00.342 EDT [4299] LOG:  duration: 106.861 ms  plan:
Query Text: SELECT g0."id", g0."genome_id", g0."seqid", g0."source", g0."feature", g0."start", g0."end", g0."score", g0."strand", g0."phase", g0."attribute" FROM genes AS g0 WHERE (g0."genome_id" = 1)
Seq Scan on public.genes g0  (cost=0.00..6591.62 rows=60632 width=224) (actual time=0.056..25.049 rows=60623 loops=1)
 Output: id, genome_id, seqid, source, feature, start, "end", score, strand, phase, attribute
 Filter: (g0.genome_id = 1)
 Rows Removed by Filter: 3907
 Buffers: shared hit=74 read=5711
2020-03-11 13:50:00.345 EDT [4299] LOG:  disconnection: session time: 0:00:00.125 user=testuser database=biodata host=::1 port=50137

pgsql-performance by date:

Previous
From: David Rowley
Date:
Subject: Re: pg12 partitions show bad performance vs pg96
Next
From: Justin Pryzby
Date:
Subject: Re: Slow ext'd query via client native implementation vs. libpq &simple psql