Re: Increasing IndexTupleData.t_info from uint16 to uint32 - Mailing list pgsql-hackers

From	Aleksander Alekseev
Subject	Re: Increasing IndexTupleData.t_info from uint16 to uint32
Date	January 19 13:40:31
Msg-id	CAJ7c6TPntu8+KUqM6SKjfFwvcCNoCdTS8sPHL3xfvv83rf+CDA@mail.gmail.com
In response to	Re: Increasing IndexTupleData.t_info from uint16 to uint32 (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List	pgsql-hackers

Hi,

> > Machine learning embedding sizes have grown rapidly over the last
> > few years, from 128 up to 4K dimensions, yielding additional value
> > and quality improvements. It's not clear when this growth will ease.
> > The vectors that the leading text embedding models generate now
> > exceed the index tuple size that IndexTupleData.t_info can represent.
> >
> > The current index tuple size is stored in 13 bits of
> > IndexTupleData.t_info, which limits the max size of an index tuple
> > to 2^13 = 8192 bytes. Vectors implemented by pgvector currently use
> > a 32-bit float for elements, which limits vector size to 2K
> > dimensions, which is no longer state of the art.
> >
> > I've attached a patch that increases IndexTupleData.t_info from
> > 16 bits to 32 bits, allowing for significantly larger index tuple
> > sizes. I would guess this patch is not a complete implementation
> > that allows for migration from previous versions, but it does
> > compile and initdb succeeds. I'd be happy to continue work if the
> > core team is receptive to an update in this area, and I'd appreciate
> > any feedback the community has on the approach.
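
For reference, the 13-bit limit discussed above comes from how the 16 bits of t_info are divided. The sketch below is a rough illustration in Python, not PostgreSQL code: the mask values mirror those in src/include/access/itup.h, while the helper names and the `overhead_bytes` parameter are hypothetical and exist only for this example. It shows that the largest encodable size is 2^13 - 1 = 8191 bytes, just under the 2^13 figure quoted above, and gives a crude upper bound on how many 32-bit floats fit in that many bytes. Real limits are somewhat lower once tuple and page headers are accounted for.

```python
# Bit layout of the 16-bit IndexTupleData.t_info field, mirroring the
# masks in PostgreSQL's src/include/access/itup.h.
INDEX_SIZE_MASK = 0x1FFF        # low 13 bits: tuple size in bytes
INDEX_AM_RESERVED_BIT = 0x2000  # bit 13: reserved for index AM use
INDEX_VAR_MASK = 0x4000         # bit 14: tuple has variable-width attributes
INDEX_NULL_MASK = 0x8000        # bit 15: tuple has null attributes

FLOAT4_BYTES = 4  # pgvector stores each vector element as a 32-bit float


def index_tuple_size(t_info: int) -> int:
    """Extract the size field, as the IndexTupleSize() macro does."""
    return t_info & INDEX_SIZE_MASK


def max_float4_elements(size_bits: int, overhead_bytes: int = 0) -> int:
    """Crude upper bound on float4 elements in one index tuple.

    `size_bits` is the width of the size field. `overhead_bytes` is a
    hypothetical knob for header overhead, which this sketch does not
    model precisely.
    """
    max_size = (1 << size_bits) - 1
    return (max_size - overhead_bytes) // FLOAT4_BYTES


if __name__ == "__main__":
    # Even with every bit of t_info set, the size field reads back as 8191.
    print(index_tuple_size(0xFFFF))
    # Upper bound on 32-bit float elements with a 13-bit size field.
    print(max_float4_elements(13))
```

All 16 bits are already assigned, so a larger size field needs either a wider t_info or a different storage scheme.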

If I read this correctly, basically the patch adds 16 useless bits for
all applications except for ML ones...

Perhaps implementing an alternative storage specifically for ML using
TAM interface would be a better approach?

--
Best regards,
Aleksander Alekseev
