Hi
On Thu, Mar 6, 2025 at 5:35 AM Aleksander Alekseev
<aleksander@timescale.com> wrote:
>
> Hi Nikhil,
>
> Many thanks for working on this. I proposed a similar patch some time
> ago [1] but the overall feedback was somewhat mixed so I choose to
> focus on something else. Thanks for peeking this up.
>
> > test=# select build_zstd_dict_for_attribute('"public"."zstd"', 1);
> > build_zstd_dict_for_attribute
> > -------------------------------
> > t
> > (1 row)
>
> Did you have a chance to familiarize yourself with the corresponding
> discussion [1] and probably the previous threads? Particularly it was
> pointed out that dictionaries should be built automatically during
> VACUUM. We also discussed a special syntax for the feature, besides
> other things.
>
> [1]:
https://www.postgresql.org/message-id/flat/CAJ7c6TOtAB0z1UrksvGTStNE-herK-43bj22%3D5xVBg7S4vr5rQ%40mail.gmail.com
Restricting dictionary generation to the vacuum process is not ideal
because it limits user control and flexibility. Compression efficiency
is highly dependent on data distribution, which can change
dynamically. By allowing users to generate dictionaries on demand via
an API, they can optimize compression when they detect inefficiencies
rather than waiting for a vacuum process, which may not align with
their needs.
Additionally, since all dictionaries are stored in the catalog table
anyway, users can generate and manage them independently without
interfering with the system’s automatic maintenance tasks. This
approach ensures better adaptability to real-world scenarios where
compression performance needs to be monitored and adjusted in real
time.
---
Nikhil Veldanda