pg_sorted_heap 0.13.0 released

I am pleased to announce pg_sorted_heap 0.13.0, a PostgreSQL extension for physically sorted heap storage, zone-map pruning, planner-integrated vector search, and a narrow fact-shaped GraphRAG query surface.

Repository: https://github.com/skuznetsov/pg_sorted_heap
Release: https://github.com/skuznetsov/pg_sorted_heap/releases/tag/v0.13.0

What is pg_sorted_heap?

pg_sorted_heap is a PostgreSQL extension that adds:

- sorted_heap: a table access method that keeps rows physically ordered by primary key and prunes heap blocks with per-page zone maps.
- sorted_hnsw: a planner-integrated HNSW index access method for built-in vector types.
- svec and hsvec: float32 and float16 vector types for PostgreSQL.
- Fact-shaped GraphRAG helpers for retrieving and reranking graph-shaped facts directly inside PostgreSQL.
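
To make the zone-map idea concrete, here is a small conceptual model in Python. It is only a sketch of the general technique (sorted pages plus a per-page min/max summary), not the extension's implementation; the `Page`, `build_pages`, and `range_scan` names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    min_key: int  # smallest key stored on the page (the "zone map")
    max_key: int  # largest key stored on the page
    rows: list    # (key, payload) tuples, sorted by key


def build_pages(rows, rows_per_page):
    """Sort rows by key and pack them into pages with a min/max summary."""
    ordered = sorted(rows, key=lambda r: r[0])
    pages = []
    for i in range(0, len(ordered), rows_per_page):
        chunk = ordered[i:i + rows_per_page]
        pages.append(Page(chunk[0][0], chunk[-1][0], chunk))
    return pages


def range_scan(pages, lo, hi):
    """Return rows with lo <= key <= hi and the number of pages read."""
    hits, pages_read = [], 0
    for page in pages:
        if page.max_key < lo or page.min_key > hi:
            continue  # the summary proves no row on this page can match
        pages_read += 1
        hits.extend(r for r in page.rows if lo <= r[0] <= hi)
    return hits, pages_read


# Rows arrive in reverse order; sorting restores physical locality.
rows = [(k, f"event-{k}") for k in range(1000, 0, -1)]
pages = build_pages(rows, rows_per_page=100)
hits, pages_read = range_scan(pages, 250, 260)
print(len(hits), "rows from", pages_read, "of", len(pages), "pages")
# prints: 11 rows from 1 of 10 pages
```

Because the rows are physically ordered, a narrow key range touches few pages, and the min/max summary lets the scan skip the rest without reading them.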

The storage side is aimed at workloads where physical locality matters: time-series data, event logs, IoT readings, ordered IDs, and large tables with range predicates on primary-key columns. The vector and GraphRAG side is aimed at applications that want retrieval to stay inside PostgreSQL instead of introducing a separate vector-search sidecar.

What is new in 0.13.0?

The main change in 0.13.0 is that the narrow fact-shaped GraphRAG contract is now part of the stable release surface. The stable GraphRAG API includes:

- sorted_heap_graph_rag(...)
- sorted_heap_graph_register(...)
- sorted_heap_graph_config(...)
- sorted_heap_graph_unregister(...)
- sorted_heap_graph_rag_stats()
- sorted_heap_graph_rag_reset_stats()

This contract is intentionally narrow. It is designed for fact tables clustered by (entity_id, relation_id, target_id), or by an equivalent registered alias mapping. Queries start with ANN seed retrieval, expand through one or more relation hops, and then exact-rerank the expanded candidate set.

Example:

    SELECT *
    FROM sorted_heap_graph_rag(
        'facts'::regclass,
        '[0.1,0.2,0.3,...]'::svec,
        relation_path := ARRAY[1, 2],
        ann_k := 64,
        top_k := 10,
        score_mode := 'path'
    );

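
The seed, expand, and rerank stages described above can be modeled in a few lines of Python. This is a toy sketch under stated assumptions, not the extension's algorithm: the seed step uses brute-force search as a stand-in for ANN, the hop semantics (follow a relation from the previous stage's target_id values) are an assumption for illustration, and `graph_rag_sketch`, `seed_k`, and `hops` are hypothetical names that do not correspond to the SQL function's parameters.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy fact table: (entity_id, relation_id, target_id) -> embedding.
FACTS = {
    (1, 1, 10): [0.9, 0.1],
    (2, 1, 20): [0.1, 0.9],
    (10, 2, 100): [0.8, 0.2],
    (10, 2, 101): [0.2, 0.8],
    (20, 2, 200): [0.5, 0.5],
}


def seed(query, k):
    """Stand-in for ANN seed retrieval: brute-force nearest facts."""
    return sorted(FACTS, key=lambda f: l2(FACTS[f], query))[:k]


def expand(frontier, relation_id):
    """One hop: facts whose entity_id is a target_id of the frontier."""
    targets = {f[2] for f in frontier}
    return [f for f in FACTS if f[0] in targets and f[1] == relation_id]


def graph_rag_sketch(query, hops, seed_k, top_k):
    frontier = seed(query, seed_k)            # 1. seed retrieval
    for relation_id in hops:                  # 2. relation-hop expansion
        frontier = expand(frontier, relation_id)
    # 3. exact rerank of the expanded candidates against the query
    return sorted(frontier, key=lambda f: l2(FACTS[f], query))[:top_k]


result = graph_rag_sketch([1.0, 0.0], hops=[2], seed_k=1, top_k=2)
print(result)  # [(10, 2, 100), (10, 2, 101)]
```

The point of the final step is that the approximate stage only needs to find good starting points; the ordering of what is returned comes from exact distances over the small expanded set.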
0.13.0 also adds a stable routed GraphRAG entry point for multi-shard or multi-tenant application flows:

- sorted_heap_graph_route(...)
- sorted_heap_graph_route_plan(...)

This gives applications one dispatcher for exact-key routing, range routing, profiles, policies, and defaults.

Other release highlights

0.13.0 also includes:

- Schema registration for non-canonical fact tables, so GraphRAG can be used with existing column names.
- Backend-local GraphRAG stage stats: seed count, expanded rows, reranked rows, returned rows, and per-stage timing.
- Lifecycle hardening across extension upgrade, dump/restore, crash recovery, concurrent online compact, and concurrent online merge.
- A shared-cache correctness fix for sorted_hnsw multi-index workloads.
- sorted_hnsw.build_sq8, an opt-in low-memory index-build mode for constrained builders.
- An experimental FlashHadamard retrieval lane. This is documented and tested, but it is not the default ANN path and not part of the stable GraphRAG contract.
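
For readers unfamiliar with the term, the sketch below shows why an 8-bit representation lowers build memory. It assumes that "sq8" in sorted_hnsw.build_sq8 refers to 8-bit scalar quantization, which the announcement does not spell out, and it models the generic technique only; `sq8_encode` and `sq8_decode` are hypothetical helpers, not functions of the extension.

```python
import struct


def sq8_encode(vec):
    """Map each float to one byte using the vector's own min/max range."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = bytes(min(255, round((x - lo) / scale)) for x in vec)
    return lo, scale, codes


def sq8_decode(lo, scale, codes):
    """Reconstruct approximate floats from the byte codes."""
    return [lo + c * scale for c in codes]


vec = [0.0, 0.25, 0.5, 1.0]
lo, scale, codes = sq8_encode(vec)
approx = sq8_decode(lo, scale, codes)

float32_bytes = len(struct.pack(f"{len(vec)}f", *vec))  # 4 bytes per value
max_error = max(abs(a - b) for a, b in zip(vec, approx))

print(float32_bytes, "bytes as float32 vs", len(codes), "bytes as 8-bit codes")
print("max reconstruction error:", max_error)
```

Each value shrinks from four bytes to one, at the cost of a reconstruction error of at most half a quantization step per component.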

Benchmark snapshots

These are workload-specific benchmark snapshots from the release notes, not universal performance claims.

AWS Gutenberg workload, about 104K x 2880D, top-10:

- sorted_hnsw (svec): 1.287 ms, 100.0% Recall@10
- sorted_hnsw (hsvec): 1.404 ms, 100.0% Recall@10
- pgvector halfvec: 2.031 ms, 99.8% Recall@10

AWS fact-shaped multihop GraphRAG workload, 5K chains, 384D:

- sorted_heap_expand_twohop_path_rerank(): median 0.962 ms
- sorted_heap_graph_rag_twohop_path_scan(): median 1.025 ms
- pgvector parity row: median 1.434 ms
- Qdrant parity row: median 3.355 ms
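
For context on the Recall@10 column, the conventional definition is the fraction of the exact top-k neighbors that the approximate search also returns, averaged over queries. The release notes quoted here do not describe the measurement harness, so the following Python is only a sketch of that conventional definition with made-up IDs; `recall_at_k` and `mean_recall_at_k` are hypothetical helper names.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that the approximate top-k also contains."""
    truth = set(exact_ids[:k])
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


def mean_recall_at_k(runs, k):
    """Average recall over (approx_ids, exact_ids) pairs, one per query."""
    return sum(recall_at_k(a, e, k) for a, e in runs) / len(runs)


# Illustrative IDs only: the second query misses one true neighbor.
exact = [7, 3, 9, 1, 4, 8, 2, 6, 5, 0]
runs = [
    (exact, exact),
    ([7, 3, 9, 1, 4, 8, 2, 6, 5, 11], exact),
]
print(f"Recall@10 = {mean_recall_at_k(runs, 10):.1%}")  # Recall@10 = 95.0%
```

Note that the order of results within the top-k does not affect this metric; only membership does.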

Installation

Requirements:

- PostgreSQL 17 or 18
- Standard PGXS build toolchain with pg_config in PATH

Build from source:

    git clone https://github.com/skuznetsov/pg_sorted_heap.git
    cd pg_sorted_heap
    make
    make install

Enable the extension:

    CREATE EXTENSION pg_sorted_heap;

Upgrade an existing installation:

    ALTER EXTENSION pg_sorted_heap UPDATE TO '0.13.0';

Minimal examples

Create a physically sorted table:

    CREATE TABLE events (
        id bigint PRIMARY KEY,
        ts timestamptz,
        payload text
    ) USING sorted_heap;

    SELECT sorted_heap_compact('events'::regclass);

Create a planner-integrated HNSW index:

    CREATE TABLE documents (
        id bigserial PRIMARY KEY,
        embedding svec(384),
        content text
    );

    CREATE INDEX documents_embedding_idx
        ON documents USING sorted_hnsw (embedding)
        WITH (m = 16, ef_construction = 200);

    SET sorted_hnsw.ef_search = 96;

    SELECT id, content
    FROM documents
    ORDER BY embedding <=> '[0.1,0.2,0.3,...]'::svec
    LIMIT 10;

Create a stable fact-shaped GraphRAG table:

    CREATE TABLE facts (
        entity_id int4,
        relation_id int2,
        target_id int4,
        embedding svec(384),
        payload text,
        PRIMARY KEY (entity_id, relation_id, target_id)
    ) USING sorted_heap;

    CREATE INDEX facts_embedding_idx
        ON facts USING sorted_hnsw (embedding)
        WITH (m = 24, ef_construction = 200);

Verification

The 0.13.0 release-candidate checks include:

- SQL regression coverage for pg_sorted_heap, sorted_hnsw, and graph_rag
- GraphRAG lifecycle coverage for upgrade and dump/restore
- crash recovery checks
- concurrent online-operation checks
- pg_upgrade coverage from PostgreSQL 17 to 18
- sorted_hnsw chunked/shared-cache integration checks

For local verification:

    make test-release

For the narrower GraphRAG release bundle:

    make test-graphrag-release

Links

- Repository: https://github.com/skuznetsov/pg_sorted_heap
- Documentation: https://skuznetsov.github.io/pg_sorted_heap/
- Release notes: https://github.com/skuznetsov/pg_sorted_heap/releases/tag/v0.13.0
- Issues: https://github.com/skuznetsov/pg_sorted_heap/issues