SQL/PGQ: Support multi-pattern path matching in GRAPH_TABLE

Started by Henson Choi, 6 days ago, 1 message, hackers
#1 Henson Choi
assam258@gmail.com

Hi hackers,

Now that the SQL/PGQ core has been committed, I'd like to propose
extending GRAPH_TABLE to accept multiple path patterns in the MATCH
clause, i.e. the comma-separated form:

    SELECT ... FROM GRAPH_TABLE (g
        MATCH <path_pattern_1>, <path_pattern_2>, ...
        COLUMNS (...)
    );

This shape is not supported today — the parser rejects it with

multiple path patterns in one GRAPH_TABLE clause not supported

and the rewriter asserts that the path_pattern_list has exactly one
entry. Among the features that are not yet covered, I think this one
has the highest practical need: many realistic graph queries express
joins and star-shaped traversals most naturally as multiple
comma-separated patterns, and without it those queries have to be
rewritten into more awkward forms. The attached patch lifts the
restriction and wires the existing path-rewriting pipeline through
the list of path patterns.

What the patch does
-------------------

Parser side: the error that rejected multi-pattern MATCH is
removed, so such queries now reach the rewriter.

Rewriter side: each path pattern is processed as its own chain,
so adjacency linking never crosses a path boundary. Element
variables that share a name across paths are still merged into
the same element, so a shared variable joins the paths that
mention it, while paths with no variable in common combine as a
cross product.

An earlier version flattened all element patterns into a single
list, but that treated elements from adjacent paths as adjacent
within one path and broke on vertex-vertex boundaries. The
per-path approach is the minimal fix; a more principled cross-path
join construction is left for this thread to settle.
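
To make the difference between the two linking strategies concrete, here is a toy model in Python. It is only a sketch and not the rewriter's code: the (kind, variable) tuples and the names link_flattened() and link_per_path() are invented for illustration.

```python
# Toy model of the two linking strategies; this is NOT the rewriter's C code.
# A path is a list of (kind, variable) pairs, where kind is "v" or "e".

def link_flattened(paths):
    """Earlier approach: concatenate all elements, then link neighbours."""
    flat = [elem for path in paths for elem in path]
    return [(flat[i], flat[i + 1]) for i in range(len(flat) - 1)]

def link_per_path(paths):
    """Per-path approach: link neighbours only inside each path."""
    links = []
    for path in paths:
        links.extend((path[i], path[i + 1]) for i in range(len(path) - 1))
    return links

# Models: MATCH (a)-[e1]->(b), (c)-[e2]->(d)
paths = [
    [("v", "a"), ("e", "e1"), ("v", "b")],
    [("v", "c"), ("e", "e2"), ("v", "d")],
]

flattened = link_flattened(paths)
per_path = link_per_path(paths)

# Links that exist only because of flattening: a vertex-vertex pair that
# straddles the boundary between the two paths.
spurious = [link for link in flattened if link not in per_path]
print(spurious)  # [(('v', 'b'), ('v', 'c'))]
```

In this model the flattened list pairs b with c even though no edge pattern sits between them, which is the kind of vertex-vertex boundary described above. The per-path version produces no such pair.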

Examples
--------

Shared variable (join):

    MATCH (a IS vl1)-[e1 IS el1]->(b IS vl2),
          (b)-[e2 IS el2]->(c IS vl3)

-- b is shared -> the two patterns are joined on b.
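
The intended semantics can be sketched outside the database with a few lines of Python. The edge contents and the combine() helper below are hypothetical. They only model the idea that each path pattern yields variable bindings and that bindings are merged when equally named variables agree.

```python
# Hypothetical toy data: each edge list holds (source_id, destination_id).
el1 = [(1, 10), (2, 20)]      # a -> b
el2 = [(10, 100), (10, 101)]  # b -> c

# Each path pattern on its own yields bindings for its variables.
path1 = [{"a": s, "b": d} for s, d in el1]
path2 = [{"b": s, "c": d} for s, d in el2]

def combine(left, right):
    """Merge two binding lists; variables with the same name must agree."""
    out = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[v] == r[v] for v in shared):
                out.append({**l, **r})
    return out

# b is shared, so only bindings that agree on b survive (a join).
joined = combine(path1, path2)
print(joined)

# No shared variable: every pairing survives (a cross product).
crossed = combine(path1, [{"d": 7}, {"d": 8}])
print(len(crossed))  # 4
```

With a shared variable the result is the join on b. With no shared variable the same function degenerates into a cross product, which matches the behavior described for disconnected patterns.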

Star/hub:

    MATCH (a IS vl1)-[]->(b IS vl2),
          (a)-[]->(c IS vl3),
          (d IS vl2)-[]->(a)

-- three patterns meeting at a.

Disconnected patterns (cross product):

MATCH (a IS vl1), (b IS vl3)

Partial connection mixed with a disconnected piece:

MATCH (a)-[]->(b), (b)-[]->(c), (d IS vl1)
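
Which patterns end up joined and which are cross-multiplied depends only on how variable names are shared. The following Python sketch groups patterns into connected components on that basis. The components() function and the representation of a pattern as a set of variable names are assumptions made for this illustration, and anonymous elements such as [] are left out because they carry no name.

```python
def components(patterns):
    """Group pattern indexes that are linked by at least one shared variable."""
    groups = []  # list of (set_of_variables, list_of_pattern_indexes)
    for idx, variables in enumerate(patterns):
        vars_here, members = set(variables), [idx]
        remaining = []
        for g_vars, g_members in groups:
            if g_vars & vars_here:
                # Shares a variable with this pattern: absorb the group.
                vars_here |= g_vars
                members = g_members + members
            else:
                remaining.append((g_vars, g_members))
        groups = remaining + [(vars_here, members)]
    return [sorted(m) for _, m in groups]

# Models: MATCH (a)-[]->(b), (b)-[]->(c), (d IS vl1)
mixed = components([{"a", "b"}, {"b", "c"}, {"d"}])
print(mixed)  # [[0, 1], [2]]

# Models the star/hub example: all three patterns meet at a.
star = components([{"a", "b"}, {"a", "c"}, {"d", "a"}])
print(star)  # [[0, 1, 2]]
```

Patterns inside one component are connected through joins on the shared variables, while separate components combine as a cross product.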

Status
------

I would appreciate feedback along these axes:

* standard conformance: whether the shape and the handling of
  multi-pattern MATCH are aligned with SQL/PGQ (ISO/IEC 9075-16);
* semantics: whether the behavior on shared variables,
  disconnected patterns, and their combinations is the right one;
* functionality: coverage gaps, cases the patch does not yet
  handle, or constructs that should be rejected but currently are
  not (and vice versa);
* robustness: correctness under edge cases, error handling, and
  anything that could destabilize the existing GRAPH_TABLE path;
* code shape: whether generate_queries_for_path_pattern() has
  grown large enough that the per-path body should be factored
  out into a helper. I kept the function intact in this round,
  but would gladly split it if reviewers prefer that.

Review comments, objections, and alternative approaches are all
welcome — please don't hesitate to push back on anything that looks
off.

Thanks,
Henson

Reference to the SQL/PGQ main thread:

/messages/by-id/CAAAe_zAEEAb=piH4n-mZUhqcL=oKbDv4v-_7C_7KyXroem=HUg@mail.gmail.com

Attachments:

v1-0001-Multi-pattern-path-matching.patch (application/octet-stream, +592 -154)