1 Status of vertical clustered index - 2 Join using (fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

Started by maxzoralmost 6 years ago4 messages
#1maxzor
maxzor@maxzor.eu

Hello,

1. I was told that M$ SQLServer provides huge performance deltas over
PostgreSQL when dealing with index-unaligned queries :
create index i on t (a,b, c);
select * from t where b=... and c=...;
Columnar storage has been tried by various companies, CitusData,
EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been
discussed quite a lot, last thread that I was able to find being in
2017,
/messages/by-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN+hCy8xOLAh2doYhVg5d6HsAA@mail.gmail.com
where Fujitsu's patch made it quite far.
What is the status on such a storage manager extension interface ?

2. What do you think of adding a new syntax : 'from t join t2 using
(fk_constraint)' ? And further graph algorithms to make automatic joins ?
Both 'natural join' and 'using (column_name)' are useless when the
columns are not the same in source and destination.
Plus it is often the case that the fk_constraints are over numerous
columns, even though this is usually advised against. But when this case
happens there will be a significant writing speedup.
I have been bothered by this to the point that I developed a
graphical-query-builder plugin for pgModeler,
https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode
,
but I believe such a syntax would be much better in the core!

3. What is the status of making the internal parser of PostgreSQL less
coupled to the core, and easier to cherry-pick from outside?
It would be great to incorporate it into companion projects : pgAdmin4,
pgModeler, pgFormatter...

BR, Maxime Chambonnet

#2Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: maxzor (#1)
Re: 1 Status of vertical clustered index - 2 Join using (fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

On Sun, Feb 16, 2020 at 10:38:29PM +0100, maxzor wrote:

Hello,

1. I was told that M$ SQLServer provides huge performance deltas over
PostgreSQL when dealing with index-unaligned queries :
create index i on t (a,b, c);
select * from t where b=... and c=...;

Perhaps index-only scans might help here, but that generally does not
work for "SELECT *" queries.

Columnar storage has been tried by various companies, CitusData,
EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been
discussed quite a lot, last thread that I was able to find being in
2017, /messages/by-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN+hCy8xOLAh2doYhVg5d6HsAA@mail.gmail.com
where Fujitsu's patch made it quite far.
What is the status on such a storage manager extension interface ?

I think you're looking for threads about zheap and (especially)
zedstore. Those are two "storage manager" implementations various people
are currently working on. Neither of those is likely to make it into
pg13, though :-(

2. What do you think of adding a new syntax : 'from t join t2 using
(fk_constraint)' ? And further graph algorithms to make automatic
joins ?
Both 'natural join' and 'using (column_name)' are useless when the
columns are not the same in source and destination.
Plus it is often the case that the fk_constraints are over numerous
columns, even though this is usually advised against. But when this
case happens there will be a significant writing speedup.

I'm not really sure what's the point / benefit here. Initially it seemed
you simply propose a syntax saying "do a join using the columns in the
FK constraint" but it's unclear to me how this implies any writing
speedup?

I have been bothered by this to the point that I developed a
graphical-query-builder plugin for pgModeler,
https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode
,
but I believe such a syntax would be much better in the core!

Hm, maybe.

3. What is the status of making the internal parser of PostgreSQL less
coupled to the core, and easier to cherry-pick from outside?
It would be great to incorporate it into companion projects :
pgAdmin4, pgModeler, pgFormatter...

I have no idea what you mean by "less coupled" here. What are the
requirements / use cases you're thinking about?

FWIW I think it's pretty bad idea to post questions about three very
different topics into a single pgsql-hackers thread. That'll just lead
to a lot of confusion.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3maxzor
maxzor@maxzor.eu
In reply to: Tomas Vondra (#2)
Re: 1 Status of vertical clustered index - 2 Join using (fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

...

Thank you will look into it!

I'm not really sure what's the point / benefit here. Initially it seemed
you simply propose a syntax saying "do a join using the columns in the
FK constraint" but it's unclear to me how this implies any writing
speedup?

This is exactly what I mean. If you know the fk_constraint (usually
there are simple patterns) you are all set, or else... you could use a
function fk(t, t2) to lookup pg_constraint, or even better / more bloat,
have psql do autocompletion for you? Corner case multiple fks between t
and t2.
'from t join t2 using(fk(t,t2))'

I have no idea what you mean by "less coupled" here. What are the
requirements / use cases you're thinking about?

A lot of external tools do query parsing or validation, I wish they
could use the official parser as a dependency. AFAIK it is currently not
the case and everyone is re-implementing its subpar parser.

FWIW I think it's pretty bad idea to post questions about three very
different topics into a single pgsql-hackers thread. That'll just lead
to a lot of confusion.

Right... I figured as a newcomer I would not spam the mailing list.
Best regards

#4Imre Samu
pella.samu@gmail.com
In reply to: maxzor (#1)
Re: 1 Status of vertical clustered index - 2 Join using (fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

3. What is the status of making the internal parser of PostgreSQL less

coupled to the core, and easier to cherry-pick from outside?

imho:
One of the current solutions is: https://github.com/lfittl/libpg_query C
library

"C library for accessing the PostgreSQL parser outside of the server.

This library uses the actual PostgreSQL server source to parse SQL queries
and return the internal PostgreSQL parse tree.Note that this is mostly
intended as a base library for

- pg_query <https://github.com/lfittl/pg_query&gt; (Ruby),

- pg_query.go <https://github.com/lfittl/pg_query.go&gt; (Go),

- pg-query-parser <https://github.com/zhm/pg-query-parser&gt; (Node),

- psqlparse <https://github.com/alculquicondor/psqlparse&gt; (Python) and

- pglast <https://pypi.org/project/pglast/&gt; (Python 3)."

"

Best,
Imre

maxzor <maxzor@maxzor.eu> ezt írta (időpont: 2020. febr. 16., V, 22:38):

Show quoted text

Hello,

1. I was told that M$ SQLServer provides huge performance deltas over
PostgreSQL when dealing with index-unaligned queries :
create index i on t (a,b, c);
select * from t where b=... and c=...;
Columnar storage has been tried by various companies, CitusData,
EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been
discussed quite a lot, last thread that I was able to find being in 2017,
/messages/by-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN+hCy8xOLAh2doYhVg5d6HsAA@mail.gmail.com
where Fujitsu's patch made it quite far.
What is the status on such a storage manager extension interface ?

2. What do you think of adding a new syntax : 'from t join t2 using
(fk_constraint)' ? And further graph algorithms to make automatic joins ?
Both 'natural join' and 'using (column_name)' are useless when the
columns are not the same in source and destination.
Plus it is often the case that the fk_constraints are over numerous
columns, even though this is usually advised against. But when this case
happens there will be a significant writing speedup.
I have been bothered by this to the point that I developed a
graphical-query-builder plugin for pgModeler,

https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode
,
but I believe such a syntax would be much better in the core!

3. What is the status of making the internal parser of PostgreSQL less
coupled to the core, and easier to cherry-pick from outside?
It would be great to incorporate it into companion projects : pgAdmin4,
pgModeler, pgFormatter...

BR, Maxime Chambonnet