Thread: pgvector as standard PostgreSQL feature?

pgvector as standard PostgreSQL feature?

From
Sebastien Flaesch
Date:
Hello,

I am looking at the pgvector, pgvectorscale, and pgai extensions.

Other DB engines support built-in vector types.

Is there a plan to get pgvector's types (vector, halfvec, sparsevec, bit) implemented as native built-in data types like json/jsonb?

Side note: I have some doubts about these type names, especially "bit" ... why not "bitvec"?

Seb

Re: pgvector as standard PostgreSQL feature?

From
Christophe Pettus
Date:

> On Mar 19, 2025, at 07:47, Sebastien Flaesch <sebastien.flaesch@4js.com> wrote:
>
> Is there a plan to get pgvector's types (vector, halfvec, sparsevec, bit) implemented as native built-in data types like json/jsonb?

(I'm speaking just for myself here.)  I would not base any plans on this functionality being available in the
PostgreSQL core in the near future (and by "near future," I mean the next five years).

1. You list three different extensions with overlapping functionality, and that's a good sign that there isn't
consensus on what the features that would be offered in core should be.

2. Adding a type to the core distribution (or even to contrib/) creates a maintenance burden on the core developers,
and that's not something assumed lightly.  Once a type is in core, it (almost) never can be removed, and the more
specialized the type and detailed the implementation, the greater the risk that the developers who know and care about
it won't be available in the future.  Search the archives for a discussion of the "money" type for what happens when a
type added to core starts becoming ill-supported... and "money" isn't anywhere near as complex as vector functionality.

3. PostgreSQL is designed to have a rich ecosystem of extensions.  The ability to add this kind of functionality in an
extension is exactly what distinguishes PostgreSQL from many other RDBMS systems.  There's no burning need to add
functionality like this to core.

It is true that hosted environments take time to adopt new extensions (although AWS RDS has supported pgvector for
nearly two years now), but that's not in itself a reason to move things into core.

> Side note: I have some doubts about these type names, especially "bit" ... why not "bitvec"?

BIT and BIT VARYING are the SQL standard names for these types.

Re: pgvector as standard PostgreSQL feature?

From
Sebastien Flaesch
Date:

Got it, makes total sense.

So pgvector etc. will probably remain an extension for a while.

Thanks for the note about the BIT type.
I had missed that it's a standard built-in type.

Seb

From: Christophe Pettus <xof@thebuild.com>
Sent: Wednesday, March 19, 2025 9:19 AM
To: Sebastien Flaesch <sebastien.flaesch@4js.com>
Cc: pgsql-general@postgresql.org <pgsql-general@postgresql.org>
Subject: Re: pgvector as standard PostgreSQL feature?
 