Thread: 1 Status of vertical clustered index - 2 Join using (fk_constraint)suggestion - 3 Status of pgsql's parser autonomization

1 Status of vertical clustered index - 2 Join using (fk_constraint)suggestion - 3 Status of pgsql's parser autonomization

From

maxzor

Date:

16 February 2020, 21:38:29

Hello,

1. I was told that M$ SQLServer provides huge performance deltas over PostgreSQL when dealing with index-unaligned queries :
create index i on t (a,b, c);
select * from t where b=... and c=...;
Columnar storage has been tried by various companies, CitusData, EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been discussed quite a lot, last thread that I was able to find being in 2017, https://www.postgresql.org/message-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN%2BhCy8xOLAh2doYhVg5d6HsAA%40mail.gmail.com where Fujitsu's patch made it quite far.
What is the status on such a storage manager extension interface ?

2. What do you think of adding a new syntax : 'from t join t2 using (fk_constraint)' ? And further graph algorithms to make automatic joins ?
Both 'natural join' and 'using (column_name)' are useless when the columns are not the same in source and destination.
Plus it is often the case that the fk_constraints are over numerous columns, even though this is usually advised against. But when this case happens there will be a significant writing speedup.
I have been bothered by this to the point that I developed a graphical-query-builder plugin for pgModeler,
https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode ,
but I believe such a syntax would be much better in the core!

3. What is the status of making the internal parser of PostgreSQL less coupled to the core, and easier to cherry-pick from outside?
It would be great to incorporate it into companion projects : pgAdmin4, pgModeler, pgFormatter...

BR, Maxime Chambonnet

Re: 1 Status of vertical clustered index - 2 Join using(fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

From

Tomas Vondra

Date:

17 February 2020, 01:40:46

On Sun, Feb 16, 2020 at 10:38:29PM +0100, maxzor wrote:
>Hello,
>
>1. I was told that M$ SQLServer provides huge performance deltas over 
>PostgreSQL when dealing with index-unaligned queries :
>create index i on t (a,b, c);
>select * from t where b=... and c=...;

Perhaps index-only scans might help here, but that generally does not
work for "SELECT *" queries.

>Columnar storage has been tried by various companies, CitusData, 
>EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been 
>discussed quite a lot, last thread that I was able to find being in 
>2017, https://www.postgresql.org/message-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN%2BhCy8xOLAh2doYhVg5d6HsAA%40mail.gmail.com 
>where Fujitsu's patch made it quite far.
>What is the status on such a storage manager extension interface ?
>

I think you're looking for threads about zheap and (especially)
zedstore. Those are two "storage manager" implementations various people
are currently working on. Neither of those is likely to make it into
pg13, though :-(

>2. What do you think of adding a new syntax : 'from t join t2 using 
>(fk_constraint)' ? And further graph algorithms to make automatic 
>joins ?
>Both 'natural join' and 'using (column_name)' are useless when the 
>columns are not the same in source and destination.
>Plus it is often the case that the fk_constraints are over numerous 
>columns, even though this is usually advised against. But when this 
>case happens there will be a significant writing speedup.

I'm not really sure what's the point / benefit here. Initially it seemed
you simply propose a syntax saying "do a join using the columns in the
FK constraint" but it's unclear to me how this implies any writing
speedup? 

>I have been bothered by this to the point that I developed a 
>graphical-query-builder plugin for pgModeler,
>https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode 
>,
>but I believe such a syntax would be much better in the core!
>

Hm, maybe.

>3. What is the status of making the internal parser of PostgreSQL less 
>coupled to the core, and easier to cherry-pick from outside?
>It would be great to incorporate it into companion projects : 
>pgAdmin4, pgModeler, pgFormatter...
>

I have no idea what you mean by "less coupled" here. What are the
requirements / use cases you're thinking about?

FWIW I think it's pretty bad idea to post questions about three very
different topics into a single pgsql-hackers thread. That'll just lead
to a lot of confusion.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: 1 Status of vertical clustered index - 2 Join using(fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

From

Tomas Vondra

Date:

17 February 2020, 01:40:46

On Sun, Feb 16, 2020 at 10:38:29PM +0100, maxzor wrote:
>Hello,
>
>1. I was told that M$ SQLServer provides huge performance deltas over 
>PostgreSQL when dealing with index-unaligned queries :
>create index i on t (a,b, c);
>select * from t where b=... and c=...;

Perhaps index-only scans might help here, but that generally does not
work for "SELECT *" queries.

>Columnar storage has been tried by various companies, CitusData, 
>EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been 
>discussed quite a lot, last thread that I was able to find being in 
>2017, https://www.postgresql.org/message-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN%2BhCy8xOLAh2doYhVg5d6HsAA%40mail.gmail.com 
>where Fujitsu's patch made it quite far.
>What is the status on such a storage manager extension interface ?
>

I think you're looking for threads about zheap and (especially)
zedstore. Those are two "storage manager" implementations various people
are currently working on. Neither of those is likely to make it into
pg13, though :-(

>2. What do you think of adding a new syntax : 'from t join t2 using 
>(fk_constraint)' ? And further graph algorithms to make automatic 
>joins ?
>Both 'natural join' and 'using (column_name)' are useless when the 
>columns are not the same in source and destination.
>Plus it is often the case that the fk_constraints are over numerous 
>columns, even though this is usually advised against. But when this 
>case happens there will be a significant writing speedup.

I'm not really sure what's the point / benefit here. Initially it seemed
you simply propose a syntax saying "do a join using the columns in the
FK constraint" but it's unclear to me how this implies any writing
speedup? 

>I have been bothered by this to the point that I developed a 
>graphical-query-builder plugin for pgModeler,
>https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode 
>,
>but I believe such a syntax would be much better in the core!
>

Hm, maybe.

>3. What is the status of making the internal parser of PostgreSQL less 
>coupled to the core, and easier to cherry-pick from outside?
>It would be great to incorporate it into companion projects : 
>pgAdmin4, pgModeler, pgFormatter...
>

I have no idea what you mean by "less coupled" here. What are the
requirements / use cases you're thinking about?

FWIW I think it's pretty bad idea to post questions about three very
different topics into a single pgsql-hackers thread. That'll just lead
to a lot of confusion.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: 1 Status of vertical clustered index - 2 Join using(fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

From

maxzor

Date:

17 February 2020, 01:56:52

> ...
Thank you will look into it!
> I'm not really sure what's the point / benefit here. Initially it seemed
> you simply propose a syntax saying "do a join using the columns in the
> FK constraint" but it's unclear to me how this implies any writing
> speedup?
This is exactly what I mean. If you know the fk_constraint (usually 
there are simple patterns) you are all set, or else... you could use a 
function fk(t, t2) to lookup pg_constraint, or even better / more bloat, 
have psql do autocompletion for you? Corner case multiple fks between t 
and t2.
'from t join t2 using(fk(t,t2))'
> I have no idea what you mean by "less coupled" here. What are the
> requirements / use cases you're thinking about?
A lot of external tools do query parsing or validation, I wish they 
could use the official parser as a dependency. AFAIK it is currently not 
the case and everyone is re-implementing its subpar parser.
> FWIW I think it's pretty bad idea to post questions about three very
> different topics into a single pgsql-hackers thread. That'll just lead
> to a lot of confusion.

Right... I figured as a newcomer I would not spam the mailing list.
Best regards

Re: 1 Status of vertical clustered index - 2 Join using(fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

From

maxzor

Date:

17 February 2020, 01:56:52

> ...
Thank you will look into it!
> I'm not really sure what's the point / benefit here. Initially it seemed
> you simply propose a syntax saying "do a join using the columns in the
> FK constraint" but it's unclear to me how this implies any writing
> speedup?
This is exactly what I mean. If you know the fk_constraint (usually 
there are simple patterns) you are all set, or else... you could use a 
function fk(t, t2) to lookup pg_constraint, or even better / more bloat, 
have psql do autocompletion for you? Corner case multiple fks between t 
and t2.
'from t join t2 using(fk(t,t2))'
> I have no idea what you mean by "less coupled" here. What are the
> requirements / use cases you're thinking about?
A lot of external tools do query parsing or validation, I wish they 
could use the official parser as a dependency. AFAIK it is currently not 
the case and everyone is re-implementing its subpar parser.
> FWIW I think it's pretty bad idea to post questions about three very
> different topics into a single pgsql-hackers thread. That'll just lead
> to a lot of confusion.

Right... I figured as a newcomer I would not spam the mailing list.
Best regards