Thread: Re: [WIP]Vertical Clustered Index (columnar store extension) - take2

On Thu, 10 Jul 2025 at 17:51, Peter Smith <smithpb2250@gmail.com> wrote:
> On Thu, Jul 10, 2025 at 4:07 PM Japin Li <japinli@hotmail.com> wrote:
> ...
>> I'm still trying to understand the patches.
>>
>> diff --git a/src/include/storage/itemptr.h b/src/include/storage/itemptr.h
>> index 74b87a9..d97d1c5 100644
>> --- a/src/include/storage/itemptr.h
>> +++ b/src/include/storage/itemptr.h
>> @@ -46,6 +46,9 @@ typedef struct ItemPointerData
>>  #endif
>>  ItemPointerData;
>>
>> +#define SizeOfIptrData \
>> +               (offsetof(ItemPointerData, ip_posid) + sizeof(OffsetNumber))
>> +
>>
>> I've noticed this macro is currently defined within core; however, I found it only
>> used in the VCI extension.
>>
>> Could you clarify the rationale for its inclusion in the core, and whether it's
>> genuinely required there, or if it would be better suited within the extension
>> itself?
>
> Right, this had previously also been reported by Tomas [1].
>
> Upon investigation, I found that this was master code from 10 years
> ago (back when this VCI patch was implemented). The master code has
> moved on since then and removed this macro [2], but this VCI patch did
> not...
>
> I'll try to address this for the next patchset.
>
> ======
> [1] https://www.postgresql.org/message-id/a748aa6b-c7e6-4d02-a590-ab404d590448%40vondra.me
> [2] https://github.com/postgres/postgres/commit/8023b5827fbada6815ce269db4f3373ac77ec7c3
>

My apologies, I wasn't following the full email thread.

--
Regards,
Japin Li



Thank You for working on this. I started going through the README and
tried running simple tests, have few concerns:

1)
I am not able to understand section 4.2 'WOS-to-ROS conversion'. When
whiteout-WOS says 'delete 4', what does that mean? 4 is CRID, TXID?
And when does delete-vector X represents? I did not  get why ColA-2,
ColA-4 and ColB-2, ColB-4 were removed in resultant data? Is the
diagram complete?

2)
We can make the definition consistent at both places as the first one
gives a feeling that rows are marked for deletion in WOS while the
second one says ROS.

Whiteout WOS = Record of WOS rows marked for deletion
Whiteout WOS -- TID records of WOS rows that are marked for deletion on ROS

3)
It is not part of README. But please help me understand the meaning
and usage of this GUC in VCI context:
vci.max_devices: Sets the maximum device number which can be attached.

4)
I was trying my hands on using VCI, I found that on enabling the VCI
path, autovacuum asserts in between. We can reproduce the scenario
using:
create extension vci;
create table tab1( i int, j int, k int);
create index ivci on tab1 using vci (i);
insert into tab1 SELECT generate_series(1,1000),
generate_series(1,1000), generate_series(1,1000);
insert into tab1 SELECT generate_series(1,1000),
generate_series(1,1000), generate_series(1,1000);
SET enable_seqscan TO off;
VACUUM ANALYZE public.tab1;

It asserts here:
TRAP: failed Assert("TransactionIdIsNormal(xmax)"), File: "heapam.c",
Line: 7373, PID: 79834
postgres: autovacuum worker postgres(ExceptionalCondition+0xbb)[0x55fdf8f3ccc6]
postgres: autovacuum worker
postgres(heap_pre_freeze_checks+0x18f)[0x55fdf877e26f]
postgres: autovacuum worker postgres(index_vacuum_cleanup+0x181)[0x55fdf87a1c06]
postgres: autovacuum worker postgres(vac_cleanup_one_index+0x27)[0x55fdf8a0503a]

-------------

Few typos in README:
a) Each VCI indexed column is stored as an internal relations.
--relations --> relation

b) Records are addresses by CRID (Columnar Record ID) instead of by TID.
--addresses->addressed

c) Extents can be found by ID using offsets in a column "meta-data"
internal relation.
-- by ID using offsets? Do you mean 'by using offsets' alone?

d) EXPLAIN ANALYSE -->EXPLAIN ANALYZE


thanks
Shveta



On Fri, Jul 11, 2025 at 1:46 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> Thank You for working on this. I started going through the README and
> tried running simple tests, have few concerns:
>
> 1)
> I am not able to understand section 4.2 'WOS-to-ROS conversion'. When
> whiteout-WOS says 'delete 4', what does that mean? 4 is CRID, TXID?
> And when does delete-vector X represents? I did not  get why ColA-2,
> ColA-4 and ColB-2, ColB-4 were removed in resultant data? Is the
> diagram complete?
>
> 2)
> We can make the definition consistent at both places as the first one
> gives a feeling that rows are marked for deletion in WOS while the
> second one says ROS.
>
> Whiteout WOS = Record of WOS rows marked for deletion
> Whiteout WOS -- TID records of WOS rows that are marked for deletion on ROS
>
> 3)
> It is not part of README. But please help me understand the meaning
> and usage of this GUC in VCI context:
> vci.max_devices: Sets the maximum device number which can be attached.
>
> 4)
> I was trying my hands on using VCI, I found that on enabling the VCI
> path, autovacuum asserts in between. We can reproduce the scenario
> using:
> create extension vci;
> create table tab1( i int, j int, k int);
> create index ivci on tab1 using vci (i);
> insert into tab1 SELECT generate_series(1,1000),
> generate_series(1,1000), generate_series(1,1000);
> insert into tab1 SELECT generate_series(1,1000),
> generate_series(1,1000), generate_series(1,1000);
> SET enable_seqscan TO off;
> VACUUM ANALYZE public.tab1;
>
> It asserts here:
> TRAP: failed Assert("TransactionIdIsNormal(xmax)"), File: "heapam.c",
> Line: 7373, PID: 79834
> postgres: autovacuum worker postgres(ExceptionalCondition+0xbb)[0x55fdf8f3ccc6]
> postgres: autovacuum worker
> postgres(heap_pre_freeze_checks+0x18f)[0x55fdf877e26f]
> postgres: autovacuum worker postgres(index_vacuum_cleanup+0x181)[0x55fdf87a1c06]
> postgres: autovacuum worker postgres(vac_cleanup_one_index+0x27)[0x55fdf8a0503a]
>
> -------------
>
> Few typos in README:
> a) Each VCI indexed column is stored as an internal relations.
> --relations --> relation
>
> b) Records are addresses by CRID (Columnar Record ID) instead of by TID.
> --addresses->addressed
>
> c) Extents can be found by ID using offsets in a column "meta-data"
> internal relation.
> -- by ID using offsets? Do you mean 'by using offsets' alone?
>
> d) EXPLAIN ANALYSE -->EXPLAIN ANALYZE
>
>

Hi Shveta.

Thanks for your README questions and typo reports. I will try to
address those soon.

Regarding your reported VACUUM exception, I believe this is the same
issue that was previously reported by Japin Li [1].

======
[1]
https://www.postgresql.org/message-id/SY8P300MB0442BEC3F5CF432F0121ACC4B642A%40SY8P300MB0442.AUSP300.PROD.OUTLOOK.COM

Kind Regards,
Peter Smith.
Fujitsu Australia