Re: [WIP]Vertical Clustered Index (columnar store extension) - take2 - Mailing list pgsql-hackers

From Peter Smith
Subject Re: [WIP]Vertical Clustered Index (columnar store extension) - take2
Date
Msg-id CAHut+Pub8PatU46FjcFN+DaREHhfHmoG7ugxiiJyfa40wKStBA@mail.gmail.com
Whole thread Raw
In response to Re: [WIP]Vertical Clustered Index (columnar store extension) - take2  (Japin Li <japinli@hotmail.com>)
List pgsql-hackers
On Fri, Jul 25, 2025 at 1:58 PM Japin Li <japinli@hotmail.com> wrote:
>
> On Wed, 23 Jul 2025 at 14:07, Peter Smith <smithpb2250@gmail.com> wrote:
> > On Tue, Jul 22, 2025 at 8:12 PM Japin Li <japinli@hotmail.com> wrote:
> > ...
...
> >>
> >> Or is it by design that users are unable to read the internal relations?
> >>
> >
> > IIUC, those VCI internal relations (implemented as materialized views)
> > are entirely managed by VCI logic. Users are not required to be aware
> > of them, and they definitely are not meant to tamper with them.
> >
>
> Thanks for your explanation!
>
> > The REFRESH that you attempted should have caused a more graceful error, like:
> > ERROR:  extension "vci" prohibits this operation on view
> > "vci_0000016482_00000_d"
> > So, thanks for reporting that the ERROR failed. Investigating...
>
> I'm considering storing this metadata in heap tables, as Citus Columnar [1]
> and TimescaleDB [2] also utilize them for some metadata. Is this a sound
> approach? I'm wondering if this is a suitable strategy for VCI?
>
> [1] https://github.com/citusdata/citus/blob/main/src/backend/columnar/columnar_metadata.c#L174
> [2] https://github.com/timescale/timescaledb/blob/main/src/chunk.c#L151
>

Hi Japin,

TL;DR;
------
What metadata relations did you have in mind to change, how do you
want to change them, and what is the main motivation?


Metadata
--------
What relations did you mean when you referred to "metadata"? There are
a number of VCI internal relations serving different purposes -- WOS,
NULL vectors, ROS Columnar Data, etc. That relation
("vci_0000016482_00000_d") is not exactly what I would call metadata;
that relation is for VCI ROS data of column 0 (the 'd' suffix means
"data"), so it is real data that VCI maintains in the ROS and keeps in
synch with the original data of the table being indexed.

The relations that VCI considers as "metadata" have an 'm' suffix. e.g.

+/**
+ * Meta Relation
+ */
+#define VCI_RELTYPE_META     ('m')
+

All "metadata" files can be seen below with VCI_RELTYPE_META suffix.
Are these the files you had in mind to change?

+ /* TID */
+ oid = vci_create_relation(GenRelName(indexRel, VCI_COLUMN_ID_TID,
VCI_RELTYPE_DATA), indexRel, indexInfo, VCI_RELTYPE_ROS);
+ vci_SetMainRelVar(vmr_info, vcimrv_tid_data_oid, 0, oid);
+ oid = vci_create_relation(GenRelName(indexRel, VCI_COLUMN_ID_TID,
VCI_RELTYPE_META), indexRel, indexInfo, VCI_RELTYPE_ROS);
+ vci_SetMainRelVar(vmr_info, vcimrv_tid_meta_oid, 0, oid);
+
+ /* NUll */
+ oid = vci_create_relation(GenRelName(indexRel, VCI_COLUMN_ID_NULL,
VCI_RELTYPE_DATA), indexRel, indexInfo, VCI_RELTYPE_ROS);
+ vci_SetMainRelVar(vmr_info, vcimrv_null_data_oid, 0, oid);
+ oid = vci_create_relation(GenRelName(indexRel, VCI_COLUMN_ID_NULL,
VCI_RELTYPE_META), indexRel, indexInfo, VCI_RELTYPE_ROS);
+ vci_SetMainRelVar(vmr_info, vcimrv_null_meta_oid, 0, oid);
+
+ /* Delete Vector */
+ oid = vci_create_relation(GenRelName(indexRel, VCI_COLUMN_ID_DELETE,
VCI_RELTYPE_DATA), indexRel, indexInfo, VCI_RELTYPE_ROS);
+ vci_SetMainRelVar(vmr_info, vcimrv_delete_data_oid, 0, oid);
+ oid = vci_create_relation(GenRelName(indexRel, VCI_COLUMN_ID_DELETE,
VCI_RELTYPE_META), indexRel, indexInfo, VCI_RELTYPE_ROS);
+ vci_SetMainRelVar(vmr_info, vcimrv_delete_meta_oid, 0, oid);

...

+ vci_SetMainRelVar(vmr_info, vcimrv_num_columns, 0, tupdesc->natts);
+ for (i = 0; i < tupdesc->natts; i++)
+ {
+ Oid column_store_oid;
+ Oid column_meta_oid;
+ vcis_m_column_t *columnPointer;
+
+ column_store_oid = vci_create_relation(GenRelName(indexRel, i,
VCI_RELTYPE_DATA), indexRel, indexInfo, VCI_RELTYPE_ROS);
+ column_meta_oid = vci_create_relation(GenRelName(indexRel, i,
VCI_RELTYPE_META), indexRel, indexInfo, VCI_RELTYPE_ROS);


Heap Tables
-----------
When you say "storing ... in heap tables", I assumed you meant that
some of the metadata files could be stored as normal row-based
PostgreSQL files -- e.g. like PG catalogs, right?

AFAIK the current VCI internal relations are implemented with a column
of byte arrays that get mapped to VCI data structures in memory --
e.g. the Extents and Offsets and Bit Vectors etc (as described in the
README) are all stored in binary form. This arrangement also helps for
manipulating Extents for compression/defragmentation, and for Delete
Vector garbage collection and CRID renumbering.

Can you please elaborate what is your goal of changing some of these
relations to be normal PostgreSQL row-based tables? e.g. What problems
do you have in mind that using the heap tables will solve?

If your idea was mainly to assist debugging, then for this purpose I
believe that the original VCI implementation (in the commercial
product) had a separate "vci inspection" tool capable of dumping the
content of the VCI internal relations in a human-readable format. But,
we did not bring this tool across to OSS. Users are not supposed to be
aware of these VCI internal relations, and they cannot change them, so
even if they could see the contents of those heap tables it might
appear meaningless.


Current Focus
--------------
We know that the VCI patches still need more work before they might be
acceptable to the OSS community. Currently, we are focusing on
simplifying the existing VCI patches and fixing bugs. We also need to
make the PostgreSQL core interface less intrusive, and add lots more
test cases.

Changing any VCI internal relations to be heap tables may involve lots
of code rewrite, so even if it can be shown advantageous to do it, I'd
prefer to put this kind of work on a TODO list for later -- e.g. it
might be premature to redesign the current logic before all the other
necessary stability work is done first.

======
Kind Regards,
Peter Smith.
Fujitsu Australia.



pgsql-hackers by date:

Previous
From: jian he
Date:
Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
Next
From: Alexander Lakhin
Date:
Subject: Re: Test instability when pg_dump orders by OID