Joins on TID

Started by Tom Laneover 7 years ago8 messageshackers

tgl@sss.pgh.pa.us

over 7 years ago

I decided to spend an afternoon seeing exactly how much work would be
needed to support parameterized TID scans, ie nestloop-with-inner-TID-
scan joins, as has been speculated about before, most recently here:

/messages/by-id/CAMqTPq=hNg0GYFU0X+xmuKy8R2ARk1+A_uQpS+Mnf71MYpBKzg@mail.gmail.com

It turns out it's not that bad, less than 200 net new lines of code
(all of it in the planner; the executor seems to require no work).

Much of the code churn is because tidpath.c is so ancient and crufty.
It was mostly ignoring the RestrictInfo infrastructure, in particular
emitting the list of tidquals as just bare clauses not RestrictInfos.
I had to change that in order to avoid inefficiencies in some places.

I haven't really looked at how much of a merge problem there'll be
with Edmund Horner's work for TID range scans. My feeling about it
is that we might be best off treating that as a totally separate
code path, because the requirements are significantly different (for
instance, a range scan needs AND semantics not OR semantics for the
list of quals to apply).

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Tom Lane (#1)

Re: Joins on TID

BTW, if we're to start taking joins on TID seriously, we should also
add the missing hash opclass for TID, so that you can do hash joins
when dealing with a lot of rows.

(In principle this also enables things like hash aggregation, though
I'm not very clear on a use-case for grouping by TID.)

regards, tom lane

Simon Riggs

simon@2ndQuadrant.com

over 7 years ago

In reply to: Tom Lane (#2)

Re: Joins on TID

On Sat, 22 Dec 2018 at 04:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, if we're to start taking joins on TID seriously, we should also
add the missing hash opclass for TID, so that you can do hash joins
when dealing with a lot of rows.

(In principle this also enables things like hash aggregation, though
I'm not very clear on a use-case for grouping by TID.)

I don't think we are trying to do TID joins more seriously, just fix a
special case.

The case cited requires the batches of work to be small, so nested loops
works fine.

Looks to me that Edmund is trying to solve the same problem. If so, this is
the best solution.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Simon Riggs (#3)

Re: Joins on TID

Simon Riggs <simon@2ndquadrant.com> writes:

On Sat, 22 Dec 2018 at 04:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, if we're to start taking joins on TID seriously, we should also
add the missing hash opclass for TID, so that you can do hash joins
when dealing with a lot of rows.

I don't think we are trying to do TID joins more seriously, just fix a
special case.
The case cited requires the batches of work to be small, so nested loops
works fine.
Looks to me that Edmund is trying to solve the same problem. If so, this is
the best solution.

No, I think what Edmund is on about is unrelated, except that it touches
some of the same code. He's interested in problems like "find the last
few tuples in this table". You can solve that today, with e.g.
"SELECT ... WHERE ctid >= '(n,1)'", but you get a stupidly inefficient
plan. If we think that's a use-case worth supporting then it'd be
reasonable to provide less inefficient implementation(s).

What I'm thinking about in this thread is joins on TID, which we have only
very weak support for today --- you'll basically always wind up with a
mergejoin, which requires full-table scan and sort of its inputs. Still,
that's better than a naive nestloop, and for years we've been figuring
that that was good enough. Several people in the other thread that
I cited felt that that isn't good enough. But if we think it's worth
taking seriously, then IMO we need to add both parameterized scans (for
nestloop-with-inner-fetch-by-tid) and hash join, because each of those
can dominate depending on how many tuples you're joining.

regards, tom lane

Simon Riggs

simon@2ndQuadrant.com

over 7 years ago

In reply to: Tom Lane (#4)

Re: Joins on TID

On Sat, 22 Dec 2018 at 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I'm thinking about in this thread is joins on TID, which we have only
very weak support for today --- you'll basically always wind up with a
mergejoin, which requires full-table scan and sort of its inputs. Still,
that's better than a naive nestloop, and for years we've been figuring
that that was good enough. Several people in the other thread that
I cited felt that that isn't good enough. But if we think it's worth
taking seriously, then IMO we need to add both parameterized scans (for
nestloop-with-inner-fetch-by-tid) and hash join, because each of those
can dominate depending on how many tuples you're joining.

That would certainly help if you are building a column store, or other new
index types.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Darafei "Komяpa" Praliaskouski

me@komzpa.net

over 7 years ago

In reply to: Tom Lane (#4)

Re: Joins on TID

Hi,

Writing as someone who used TID joins and group by's in the past.

One use case is having a chance to peek into what will DELETE do.
A lot of GIS tables don't have any notion of ID, and dirty datasets tend to
have many duplicates you need to cross-reference with something else. So,
you write your query in form of

CREATE TABLE ttt as (SELECT distinct on (ctid) ctid as ct, field1, field2,
b.field3, ... from table b join othertable b on ST_Whatever(a.geom,
b.geom));

DELETE FROM table a USING ttt b where a.ctid = b.ct;
DROP TABLE ttt;

Here:
- distinct on ctid is used (hash?)
- a.ctid = b.ct (hash join candidate?)

I know it's all better with proper IDs, but sometimes it works like that,
usually just once per dataset.

сб, 22 дек. 2018 г. в 19:31, Tom Lane <tgl@sss.pgh.pa.us>:

Simon Riggs <simon@2ndquadrant.com> writes:

On Sat, 22 Dec 2018 at 04:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, if we're to start taking joins on TID seriously, we should also
add the missing hash opclass for TID, so that you can do hash joins
when dealing with a lot of rows.

I don't think we are trying to do TID joins more seriously, just fix a
special case.
The case cited requires the batches of work to be small, so nested loops
works fine.
Looks to me that Edmund is trying to solve the same problem. If so, this

is

the best solution.

No, I think what Edmund is on about is unrelated, except that it touches
some of the same code. He's interested in problems like "find the last
few tuples in this table". You can solve that today, with e.g.
"SELECT ... WHERE ctid >= '(n,1)'", but you get a stupidly inefficient
plan. If we think that's a use-case worth supporting then it'd be
reasonable to provide less inefficient implementation(s).

What I'm thinking about in this thread is joins on TID, which we have only
very weak support for today --- you'll basically always wind up with a
mergejoin, which requires full-table scan and sort of its inputs. Still,
that's better than a naive nestloop, and for years we've been figuring
that that was good enough. Several people in the other thread that
I cited felt that that isn't good enough. But if we think it's worth
taking seriously, then IMO we need to add both parameterized scans (for
nestloop-with-inner-fetch-by-tid) and hash join, because each of those
can dominate depending on how many tuples you're joining.

regards, tom lane

--

Darafei Praliaskouski
Support me: http://patreon.com/komzpa

Edmund Horner

ejrh00@gmail.com

over 7 years ago

In reply to: Tom Lane (#1)

Re: Joins on TID

On Sat, 22 Dec 2018 at 12:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I decided to spend an afternoon seeing exactly how much work would be
needed to support parameterized TID scans, ie nestloop-with-inner-TID-
scan joins, as has been speculated about before, most recently here:

/messages/by-id/CAMqTPq=hNg0GYFU0X+xmuKy8R2ARk1+A_uQpS+Mnf71MYpBKzg@mail.gmail.com

It turns out it's not that bad, less than 200 net new lines of code
(all of it in the planner; the executor seems to require no work).

Much of the code churn is because tidpath.c is so ancient and crufty.
It was mostly ignoring the RestrictInfo infrastructure, in particular
emitting the list of tidquals as just bare clauses not RestrictInfos.
I had to change that in order to avoid inefficiencies in some places.

It seems good, and I can see you've committed it now. (I should have
commented sooner, but it's the big summer holiday period here, which
means I have plenty of time to work on PostgreSQL, but none of my
usual resources. In any case, I was going to say "this looks useful
and not too complicated, please go ahead".)

I did notice that multiple tidquals are no longer removed from scan_clauses:

EXPLAIN SELECT * FROM pg_class WHERE ctid = '(1,1)' OR ctid = '(2,2)';

Tid Scan on pg_class (cost=0.01..8.03 rows=2 width=265)
TID Cond: ((ctid = '(1,1)'::tid) OR (ctid = '(2,2)'::tid))
Filter: ((ctid = '(1,1)'::tid) OR (ctid = '(2,2)'::tid))

I guess if we thought it was a big deal we could attempt to recreate
the old logic with RestrictInfos.

I haven't really looked at how much of a merge problem there'll be
with Edmund Horner's work for TID range scans. My feeling about it
is that we might be best off treating that as a totally separate
code path, because the requirements are significantly different (for
instance, a range scan needs AND semantics not OR semantics for the
list of quals to apply).

Well, I guess it's up to me to merge it. I can't quite see which
parts we'd use a separate code path for. Can you elaborate?

Edmund

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Edmund Horner (#7)

Re: Joins on TID

Edmund Horner <ejrh00@gmail.com> writes:

On Sat, 22 Dec 2018 at 12:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I decided to spend an afternoon seeing exactly how much work would be
needed to support parameterized TID scans, ie nestloop-with-inner-TID-
scan joins, as has been speculated about before, most recently here:
...

It seems good, and I can see you've committed it now. (I should have
commented sooner, but it's the big summer holiday period here, which
means I have plenty of time to work on PostgreSQL, but none of my
usual resources. In any case, I was going to say "this looks useful
and not too complicated, please go ahead".)

OK.

I did notice that multiple tidquals are no longer removed from scan_clauses:
EXPLAIN SELECT * FROM pg_class WHERE ctid = '(1,1)' OR ctid = '(2,2)';
Tid Scan on pg_class (cost=0.01..8.03 rows=2 width=265)
TID Cond: ((ctid = '(1,1)'::tid) OR (ctid = '(2,2)'::tid))
Filter: ((ctid = '(1,1)'::tid) OR (ctid = '(2,2)'::tid))

I fixed that in the committed version, I believe. (I'd been
overoptimistic about whether logic could be removed from
create_tidscan_plan.)

I haven't really looked at how much of a merge problem there'll be
with Edmund Horner's work for TID range scans. My feeling about it
is that we might be best off treating that as a totally separate
code path, because the requirements are significantly different (for
instance, a range scan needs AND semantics not OR semantics for the
list of quals to apply).

Well, I guess it's up to me to merge it. I can't quite see which
parts we'd use a separate code path for. Can you elaborate?

The thing that's bothering me is something I hadn't really focused on
before, but which looms large now that I've thought about it: the
TID-quals list of a TidPath or TidScan has OR semantics, viz it can
directly represent

ctid = this OR ctid = that OR ctid = the_other

as a list of tideq OpExprs. But what you want for a range scan on
TID is implicit-AND, because you might have either a one-sided
condition, say

ctid >= this

or a range condition

ctid >= this AND ctid <= that

I see that what you've done to make this sort-of work in the existing
patch is to insist that a range scan have just one member at the OR-d list
level and that has to be an AND'ed sublist, but TBH I think that's a mess;
for instance I wonder whether the code works correctly if faced with cases
like

ctid >= this OR ctid <= that

I don't think it's at all practical to have tidpath.c dealing with both
cases in one scan of the quals --- even if you can make it work, it'll be
unreasonably complicated and hard to understand. I'd be inclined to just
have it thumb through the restrictinfo or joininfo list a second time,
looking for inequalities, and build a path for that case separately.

I suspect that on the whole, you'd be better off treating the range-scan
case as completely separate, with a different Path type and different
Plan type too (ie, separate executor support). Yes, this would involve
some duplication of support code, but I think the end result would be
a lot cleaner and easier to understand.

regards, tom lane

Joins on TID

Attachments:

Attachments: