Integrating HLL cardinality estimates with join operator estimation - Mailing list pgsql-hackers

From	Abhishek Kumar
Subject	Integrating HLL cardinality estimates with join operator estimation
Date	November 24 06:31:09
Msg-id	CAGXSBMHkzmFPYekSr64FyFLKkv2zyO8m_q8s_9tS5=MY-Mn4tQ@mail.gmail.com
List	pgsql-hackers

Dear PostgreSQL hackers,

I am writing to seek guidance and potential collaboration on a project involving cardinality estimation improvements in PostgreSQL. The project aims to enhance join result cardinality estimation by incorporating HyperLogLog (HLL) estimates alongside the existing join operator framework.

Project Overview:

Goal: Improve the accuracy of join cardinality estimation using HLL sketches
Scope: Modify the existing join estimation logic to consider HLL-based distinct count estimates
Expected benefit: More accurate query plans for joins involving columns with high cardinality
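To make the scope concrete, here is a minimal sketch of where a distinct-count estimate enters an equality-join estimate, assuming no most-common-value lists are available and the usual fallback of dividing by the larger of the two distinct counts. The function and parameter names are hypothetical and are not PostgreSQL APIs; the distinct counts could come from ANALYZE or from an HLL sketch.

```c
#include <assert.h>

/*
 * Illustrative only: these names are hypothetical and do not exist in
 * PostgreSQL. nd_outer and nd_inner are distinct-count estimates for
 * the two join columns, whatever their source (ANALYZE, an HLL sketch).
 */
double
sketch_eqjoin_selectivity(double nd_outer, double nd_inner,
                          double nullfrac_outer, double nullfrac_inner)
{
    double nd_max;

    /* Assumption: both columns have at least one distinct value. */
    assert(nd_outer >= 1.0 && nd_inner >= 1.0);

    /* Divide by the larger distinct count; NULLs never match. */
    nd_max = (nd_outer > nd_inner) ? nd_outer : nd_inner;
    return (1.0 - nullfrac_outer) * (1.0 - nullfrac_inner) / nd_max;
}

/* Estimated join output rows: cross product scaled by the selectivity. */
double
sketch_join_rows(double rows_outer, double rows_inner, double selectivity)
{
    return rows_outer * rows_inner * selectivity;
}
```

Because the estimate is inversely proportional to the larger distinct count, any error in that count carries over proportionally to the row estimate, which is why a more accurate distinct count matters most for high-cardinality columns.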

Technical Areas of Interest:

Current implementation of join selectivity estimation in src/backend/optimizer
Integration points for HLL sketches within the existing statistics framework
Potential modifications needed to the join operator logic

Questions for the Community:

Has similar work been attempted or discussed previously?
What would be the preferred approach to integrate HLL estimates with the existing join estimation framework?
Are there specific areas of the codebase I should focus on initially?
Would this enhancement align with the project's current direction for query optimization?

I previously worked on the buffer replacement policy in Postgres, where I implemented a LazyBufferReplacementPolicy based on FIFO queues in place of the clock sweep algorithm, so I have some familiarity with the Postgres codebase.

I would greatly appreciate any guidance, feedback, or suggestions from the community.
I'm happy to provide more detailed information about the proposed approach or clarify any aspects of the project.

Thank you for your time and consideration.

Best regards,
Abhishek Kumar
