GSoC 2017: Foreign Key Arrays

Started by Mark Rofail, about 9 years ago, 202 messages, hackers

markm.rofail@gmail.com

about 9 years ago

Dear PostgreSQL hacker community,

I am working on Foreign Key Arrays as part of the Google Summer of Code
2017.

I will be posting progress reports on this thread twice a week
(Mondays and Fridays), so anyone who is willing to comment, please do.

*The Problem*
Foreign Key Arrays were introduced by Marco Nenciarini[1]; however, the
proposed patch had some performance issues. Assume we have two
tables, where table B has a foreign key array that references table A. Any
change in table A (INSERT, UPDATE, DELETE) would trigger a referential
integrity check on table B. The current implementation uses sequential
scans to accomplish this, which limits the size of tables using Foreign Key
Arrays to ~100 records and is not practical in real-life applications.

*The Proposed Solution*
Ultimately, as proposed by Tom Lane[2], we would like to replace the
sequential scan with a GIN-indexed scan, which would greatly enhance
performance.

Achieving this requires introducing a number of new operators.
However, for the scope of the project, we will focus on the most basic case,
where the Primary Keys are of pseudo-type anyelement and the Foreign Keys
are of pseudo-type anyarray. The main operator of concern will therefore be
@>(anyarray,anyelement).
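
To make the lookup concrete, here is a minimal sketch using only what stock PostgreSQL already provides. The table and index names are hypothetical, and since @>(anyarray,anyelement) does not exist yet, the sketch wraps the element in a one-element array and uses the existing @>(anyarray,anyarray) operator, which the built-in GIN array_ops opclass supports.

```sql
-- Hypothetical tables for illustration only: b.a_ids plays the role of
-- the foreign key array, a.id the referenced primary key.
CREATE TABLE a (id integer PRIMARY KEY);
CREATE TABLE b (id integer PRIMARY KEY, a_ids integer[] NOT NULL);

-- GIN index on the array column (array_ops is the default GIN opclass
-- for array types).
CREATE INDEX b_a_ids_gin ON b USING gin (a_ids);

INSERT INTO a VALUES (1), (2), (3);
INSERT INTO b VALUES (10, ARRAY[1, 2]), (11, ARRAY[3]);

-- The kind of question a referential check has to answer before
-- "DELETE FROM a WHERE id = 2" can proceed: does any row of b still
-- reference 2? The element is wrapped in an array to use the existing
-- @>(anyarray, anyarray) operator.
SELECT id FROM b WHERE a_ids @> ARRAY[2];   -- returns the row with id = 10

-- Inspect the plan. On a table this small the planner may still choose
-- a sequential scan; the index matters as b grows.
EXPLAIN SELECT id FROM b WHERE a_ids @> ARRAY[2];
```

This only shows the lookup itself; it says nothing about how the patch wires the check into the RI triggers.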

*Progress So Far*
The actual coding begins on the 30th of May; until then I will use my time
to research and settle the technical details of my plan.

- Collected resources about GIN indexing
  - http://www.sigaev.ru/gin/README.txt
  - https://wiki.postgresql.org/wiki/GIN_generalization
  - src/backend/access/gin/README in the repo

- Cloned the git repo found at https://github.com/postgres/postgres and
  identified the two main files I will be concerned with. (I know I may need
  to edit other files, but these seem to be where I will spend most of my
  summer.)
  - src/backend/commands/tablecmds.c
  - src/backend/utils/adt/ri_triggers.c

*I am yet to identify the files concerned with the GIN opclass. <-- if
anyone can help with this*

- Read a little about operator classes
  - https://www.postgresql.org/docs/9.5/static/indexes-opclass.html
- Explored the existing operator classes in Postgres

*Next Step*
I still don't have a solid grasp of how I am going to approach creating an
operator, so I would like to experiment till the next report on creating a
very simple operator.
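
As a rough idea of what such an experiment could look like, the sketch below defines an element-containment operator at the SQL level. The function name array_contains_elem and the operator name @>> are made up for illustration and are not taken from any patch. An operator defined this way is not known to any GIN opclass, so it would not use an index by itself.

```sql
-- Hypothetical helper: true if the array (first argument) contains the
-- element (second argument). The name is an assumption for this sketch.
CREATE FUNCTION array_contains_elem(anyarray, anyelement)
RETURNS boolean
LANGUAGE sql IMMUTABLE
AS 'SELECT $2 = ANY ($1)';

-- Hypothetical operator name; a distinct name is chosen here so it cannot
-- be confused with the existing @>(anyarray, anyarray).
CREATE OPERATOR @>> (
    LEFTARG   = anyarray,
    RIGHTARG  = anyelement,
    PROCEDURE = array_contains_elem
);

SELECT ARRAY[1, 2, 3] @>> 2;   -- true
SELECT ARRAY[1, 2, 3] @>> 7;   -- false
```

Making such an operator indexable would additionally require adding it to a GIN operator family, along with the matching support-function logic.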

*I have attached the original proposal here.*

[1]: /messages/by-id/1343842863.5162.4.camel@greygoo.devise-it.lan
[2]: /messages/by-id/28389.1351094795@sss.pgh.pa.us

Best Regards,
Mark Rofail

Robert Haas

robertmhaas@gmail.com

about 9 years ago

In reply to: Mark Rofail (#1)

Re: GSoC 2017: Foreign Key Arrays

On Mon, May 22, 2017 at 7:51 PM, Mark Rofail <markm.rofail@gmail.com> wrote:

Cloned the git repo found at https://github.com/postgres/postgres and
identified the two main files I will be concerned with. (I know I may need
to edit other files, but these seem to be where I will spend most of my
summer.)

src/backend/commands/tablecmds.c
src/backend/utils/adt/ri_triggers.c

I am yet to identify the files concerned with the GIN opclass. <-- if anyone
can help with this

There's not only one GIN opclass. You can get a list like this:

select oid, * from pg_opclass where opcmethod = 2742;

Actually, you probably want to look for GIN opfamilies:

rhaas=# select oid, * from pg_opfamily where opfmethod = 2742;
 oid  | opfmethod |    opfname     | opfnamespace | opfowner
------+-----------+----------------+--------------+----------
 2745 |      2742 | array_ops      |           11 |       10
 3659 |      2742 | tsvector_ops   |           11 |       10
 4036 |      2742 | jsonb_ops      |           11 |       10
 4037 |      2742 | jsonb_path_ops |           11 |       10
(4 rows)

To see which SQL functions are used to implement a particular
opfamily, use the OID from the previous step in a query like this:

rhaas=# select prosrc from pg_amop, pg_operator, pg_proc where
amopfamily = 2745 and amopopr = pg_operator.oid and oprcode =
pg_proc.oid;
     prosrc
----------------
 array_eq
 arrayoverlap
 arraycontains
 arraycontained
(4 rows)

Then, you can look for those in the source tree. You can also search
for the associated support functions, e.g.:

You might want to read https://www.postgresql.org/docs/devel/static/xindex.html

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

