Why is this query touching 4gb of buffers? - Mailing list pgsql-general
| From | hubert depesz lubaczewski |
|---|---|
| Subject | Why is this query touching 4gb of buffers? |
| Date | |
| Msg-id | aPtcnI2BZkeHjRvF@depesz.com Whole thread Raw |
| Responses |
Re: Why is this query touching 4gb of buffers?
|
| List | pgsql-general |
Hi,
I have weird-ish case, that I can't grok, or at least explain in
hand-wavy way.
Very simple query:
SELECT
some_table.communication_channel_id,
some_table.root_account_id
FROM
some_schema.some_table
WHERE
workflow_state = 'pending' AND
send_at <= '2025-10-23 12:01:13';
On complex-ish table:
Table "some_schema.some_table"
Column │ Type │ Collation │ Nullable │
Default
═════════════════════════════════╪═════════════════════════════╪═══════════╪══════════╪═══════════════════════════════════════════════════════════════════
id │ bigint │ │ not null │
nextval('some_schema.some_table_id_seq'::regclass)
notification_id │ bigint │ │ │
notification_policy_id │ bigint │ │ │
context_id │ bigint │ │ │
context_type │ character varying(255) │ │ │
communication_channel_id │ bigint │ │ │
frequency │ character varying(255) │ │ │
workflow_state │ character varying(255) │ │ │
batched_at │ timestamp without time zone │ │ │
created_at │ timestamp without time zone │ │ │
updated_at │ timestamp without time zone │ │ │
send_at │ timestamp without time zone │ │ │
link │ text │ │ │
name_of_topic │ text │ │ │
summary │ text │ │ │
root_account_id │ bigint │ │ │
notification_policy_override_id │ bigint │ │ │
Indexes:
"some_table_pkey" PRIMARY KEY, btree (id), tablespace "data1"
…
"index_some_table_pending" btree (send_at) WHERE workflow_state::text = 'pending'::text, tablespace "data1"
…
Running this via explain shows:
Index Scan using index_some_table_pending on some_table (cost=0.43..399992.44 rows=2215063 width=16) (actual
time=382.466..382.466rows=0 loops=1)
Index Cond: (send_at <= '2025-10-23 12:01:13'::timestamp without time zone)
Buffers: shared hit=554347
Planning Time: 0.082 ms
Execution Time: 382.482 ms
(5 rows)
Specifically, I see that it returned 0 rows, and to do it, it had to "touch" 554k buffers - ~ 4GB of data.
Sizes of the relavant things:
oid │ relkind │ pg_relation_size │ pg_table_size │ pg_total_relation_size
══════════════════════════════════════╪═════════╪══════════════════╪═══════════════╪════════════════════════
some_schema.some_table │ r │ 15339020288 │ 15376539648 │ 22347988992
some_schema.index_some_table_pending │ i │ 77127680 │ 77168640 │ 77168640
(2 rows)
I didn't check analyze time, so can't comment on this, but I did reindex of
this index, which shrunk it to 16973824 bytes, and now the query runs, as expected, in < 1ms:
Index Scan using index_some_table_pending on some_table (cost=0.43..392423.37 rows=2215272 width=16) (actual
time=0.005..0.005rows=0 loops=1)
Index Cond: (send_at <= '2025-10-23 12:01:13'::timestamp without time zone)
Buffers: shared hit=3
Planning:
Buffers: shared hit=3
Planning Time: 0.787 ms
Execution Time: 0.016 ms
(7 rows)
While talking with others, there have been used certain terms, like
"unbounded range", which I understand, but I still don't see why, with
the same stats, and NO rows returned by index - before index, pg has to
do stuff to 4GB of data?! Anyone could try to explain?
If it matters, this is rather old Pg: PostgreSQL 14.7, and the query was
running on streaming replica, but results on primary were basically the
same.
Best regards,
depesz
pgsql-general by date: