Re: Slow queries on simple index - Mailing list pgsql-performance
| From | Pavel Stehule |
|---|---|
| Subject | Re: Slow queries on simple index |
| Date | |
| Msg-id | CAFj8pRDGQL6SzLEtuwkQoO8uADLnOcFentffsJVwhaLKb4TuXg@mail.gmail.com Whole thread Raw |
| In response to | Slow queries on simple index ("Lillian Berry" <lillian@star-ark.net>) |
| List | pgsql-performance |
Hi
so 17. 1. 2026 v 15:47 odesílatel Lillian Berry <lillian@star-ark.net> napsal:
Logs on our PostgreSQL cluster show frequent slow reads on a simple query by unique index.
This issue has been present for some years, and survived multiple server moves as well as PostgreSQL version upgrades.
Vacuuming or reindexing the table does not seem to help.
Jan 17 13:38:13 database-node1-bd9aa148 pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[172632]: pluralkit-db-dataLOG: duration: 2322.842 ms execute <unnamed>: SELECT * FROM "members" WHERE "hid" = $1
Jan 17 13:38:15 database-node1-bd9aa148 pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[157893]: pluralkit-db-dataLOG: duration: 2155.033 ms execute _auto8: SELECT * FROM "members" WHERE "hid" = $1
Jan 17 13:38:16 database-node1-bd9aa148 pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[157826]: pluralkit-db-dataLOG: duration: 2164.851 ms execute _auto10: SELECT * FROM "members" WHERE "hid" = $1
Jan 17 13:38:19 database-node1-bd9aa148 pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[171076]: pluralkit-db-dataLOG: duration: 2039.040 ms execute _auto3: SELECT * FROM "members" WHERE "hid" = $1
Jan 17 13:38:20 database-node1-bd9aa148 pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[162682]: pluralkit-db-dataLOG: duration: 2189.700 ms execute _auto8: SELECT * FROM "members" WHERE "hid" = $1
Jan 17 13:38:22 database-node1-bd9aa148 pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[159824]: pluralkit-db-dataLOG: duration: 2137.666 ms execute _auto10: SELECT * FROM "members" WHERE "hid" = $1
Parameter detail has been redacted for privacy.
This is not reproducible on manual queries, whether using the same values or other random ones.e
There can be lot of possible reasons:
1. virtualization
2. network issues
3. application server lag
4. maybe by using transparent huge pages
It is pretty hard to identify these issues, because source of this issue can be everything in every layer
At the end I executed every second just simple select SELECT 1; and I checked the time
It can help with removing some hypotheses like application server issues or network issues.
Regards
Pavel
pluralkit=# explain (analyze, buffers, settings) select * from members where hid = 'exampl';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Index Scan using members_hid_key on members (cost=0.43..8.45 rows=1 width=838) (actual time=0.098..0.098 rows=0.00 loops=1)
Index Cond: (hid = 'exampl'::bpchar)
Index Searches: 1
Buffers: shared hit=3
Settings: effective_cache_size = '40GB'
Planning Time: 0.161 ms
Execution Time: 0.124 ms
(7 rows)
Table info:
pluralkit=# \d+ members
Table "public.members"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------------------+-----------------------------+-----------+----------+------------------------------------------+----------+-------------+--------------+-------------
id | integer | | not null | nextval('members_id_seq'::regclass) | plain | | |
hid | character(6) | | not null | | extended | | |
system | integer | | not null | nextval('members_system_seq'::regclass) | plain | | |
color | character(6) | | | | extended | | |
avatar_url | text | | | | extended | | |
name | text | | not null | | extended | | |
birthday | date | | | | plain | | |
pronouns | text | | | | extended | | |
description | text | | | | extended | | |
created | timestamp without time zone | | not null | timezone('utc'::text, CURRENT_TIMESTAMP) | plain | | |
display_name | text | | | | extended | | |
proxy_tags | proxy_tag[] | | not null | ARRAY[]::proxy_tag[] | extended | | |
keep_proxy | boolean | | not null | false | plain | | |
member_visibility | integer | | not null | 1 | plain | | |
message_count | integer | | not null | 0 | plain | | |
description_privacy | integer | | not null | 1 | plain | | |
name_privacy | integer | | not null | 1 | plain | | |
avatar_privacy | integer | | not null | 1 | plain | | |
birthday_privacy | integer | | not null | 1 | plain | | |
pronoun_privacy | integer | | not null | 1 | plain | | |
metadata_privacy | integer | | not null | 1 | plain | | |
allow_autoproxy | boolean | | not null | true | plain | | |
banner_image | text | | | | extended | | |
uuid | uuid | | | public.gen_random_uuid() | plain | | |
last_message_timestamp | timestamp without time zone | | | | plain | | |
webhook_avatar_url | text | | | | extended | | |
proxy_privacy | integer | | not null | 1 | plain | | |
tts | boolean | | not null | false | plain | | |
banner_privacy | integer | | not null | 1 | plain | | |
Indexes:
"members_pkey" PRIMARY KEY, btree (id)
"members_by_system" btree (system)
"members_hid_key" UNIQUE CONSTRAINT, btree (hid)
"members_uuid_idx" btree (uuid)
Check constraints:
"members_avatar_privacy_check" CHECK (avatar_privacy = ANY (ARRAY[1, 2]))
"members_banner_privacy_check" CHECK (banner_privacy = ANY (ARRAY[1, 2]))
"members_birthday_privacy_check" CHECK (birthday_privacy = ANY (ARRAY[1, 2]))
"members_description_privacy_check" CHECK (description_privacy = ANY (ARRAY[1, 2]))
"members_member_privacy_check" CHECK (member_visibility = ANY (ARRAY[1, 2]))
"members_metadata_privacy_check" CHECK (metadata_privacy = ANY (ARRAY[1, 2]))
"members_name_privacy_check" CHECK (name_privacy = ANY (ARRAY[1, 2]))
"members_pronoun_privacy_check" CHECK (pronoun_privacy = ANY (ARRAY[1, 2]))
"members_proxy_privacy_check" CHECK (proxy_privacy = ANY (ARRAY[1, 2]))
Foreign-key constraints:
"members_system_fkey" FOREIGN KEY (system) REFERENCES systems(id) ON DELETE CASCADE
Referenced by:
TABLE "autoproxy" CONSTRAINT "autoproxy_autoproxy_member_fkey" FOREIGN KEY (autoproxy_member) REFERENCES members(id) ON DELETE SET NULL
TABLE "group_members" CONSTRAINT "group_members_member_id_fkey" FOREIGN KEY (member_id) REFERENCES members(id) ON DELETE CASCADE
TABLE "member_guild" CONSTRAINT "member_guild_member_fkey" FOREIGN KEY (member) REFERENCES members(id) ON DELETE CASCADE
TABLE "switch_members" CONSTRAINT "switch_members_member_fkey" FOREIGN KEY (member) REFERENCES members(id) ON DELETE CASCADE
Access method: heap
pluralkit=# SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='members';
-[ RECORD 1 ]--+--------------
relname | members
relpages | 816042
reltuples | 1.3412575e+07
relallvisible | 297904
relkind | r
relnatts | 29
relhassubclass | f
reloptions |
pg_table_size | 6687031296
postgresql.conf:
archive_command = '/nix/store/dz9vwg5grhvpz4v7rscdl95n7s5z8c3x-pluralkit-scripts-1/bin/pk-walg data 5432 pluralkit wal-push %p'
archive_mode = 'yes'
archive_timeout = 60
effective_cache_size = '40GB'
hba_file = '/nix/store/0pihm0qzk2pdn1d344p8kihpa58hd7s6-pg_hba.conf'
listen_addresses = 'localhost, 100.123.150.13, fdaa:9:e856:a7b:910f:0:a:102'
log_destination = 'stderr'
log_line_prefix = 'pluralkit-db-data'
log_min_duration_statement = 1000
max_connections = 2000
max_wal_size = '1GB'
min_wal_size = '80MB'
port = 5432
shared_buffers = '50GB'
Environment:
PostgreSQL 18.1 on x86_64-pc-linux-gnu, compiled by clang version 20.1.8, 64-bit
installed from nixpkgs
NixOS 25.11 on Linux 6.12.61 on AMD EPYC 7351P
2x Intel SSDPELKX020T7 in RAID1 (md-raid), with LUKS, ext4 filesystem
Same issue previously appeared on
PostgreSQL 14, both self-compiled and from nixpkgs
NixOS 24.10 on AMD Ryzen 7950X
ext4 filesystem directly on single Kioxia enterprise SSD (sorry I don't remember the model)
Kindly,
Lillian
pgsql-performance by date: