Performance improvement for joins where outer side is unique

Started by David Rowleyover 11 years ago131 messageshackers

dgrowleyml@gmail.com

over 11 years ago

Hi,

I've been hacking a bit at the join code again... This time I've been
putting some effort into optimising the case where the inner side of the
join is known to be unique.
For example, given the tables:

create table t1 (id int primary key);
create table t2 (id int primary key);

And query such as:

select * from t1 left outer join t2 on t1.id=t2.id;

It is possible to deduce at planning time that "for each row in the outer
relation, only 0 or 1 rows can exist in the inner relation", (inner being
t2)

In all of our join algorithms in the executor, if the join type is SEMI,
we skip to the next outer row once we find a matching inner row. This is
because we don't want to allow duplicate rows in the inner side to
duplicate outer rows in the result set. Obviously this is required per SQL
spec. I believe we can also skip to the next outer row in this case when
we've managed to prove that no other row can possibly exist that matches
the current outer row, due to a unique index or group by/distinct clause
(for subqueries).

I've attached a patch which aims to prove this idea works. Although so far
I've only gotten around to implementing this for outer joins.

Since code already existed in analyzejoin.c which aimed to determine if a
join to a relation was unique on the join's condition, the patch is pretty
much just some extra plumbing and a small rewire of analyzejoin.c, which
just aims to get the "unique_inner" bool value down to the join node.

The performance improvement is somewhere around 5-10% for hash joins, and a
bit less for merge join. In theory it could be 50% for nested loop joins.
In my life in the OLTP world, these "unique joins" pretty much cover all
joins that ever happen ever. Perhaps the OLAP world is different, so from
my point of view this is going to be a very nice performance gain.

I'm seeing this patch (or a more complete version) as the first step to a
whole number of other planner improvements:

A couple of examples of those are:

1.
explain select * from sales s inner join product p on p.productid =
s.productid order by s.productid,p.name;

The current plan for this looks like:
QUERY PLAN
--------------------------------------------------------------------------------
Sort (cost=149.80..152.80 rows=1200 width=46)
Sort Key: s.productid, p.name
-> Hash Join (cost=37.00..88.42 rows=1200 width=46)
Hash Cond: (s.productid = p.productid)
-> Seq Scan on sales s (cost=0.00..31.40 rows=2140 width=8)
-> Hash (cost=22.00..22.00 rows=1200 width=38)
-> Seq Scan on product p (cost=0.00..22.00 rows=1200
width=38)

But in reality we could have executed this with a simple merge join using
the PK of product (productid) to provide presorted input. The extra sort
column on p.name is redundant due to productid being unique.

The UNION planning is crying out for help for cases like this. Where it
could provide sorted input on a unique index, providing the union's
targetlist contained all of the unique index's columns, and we also managed
to find an index on the other part of the union on the same set of columns,
then we could perform a Merge Append and a Unique. This would cause a
signification improvement in execution time for these types of queries, as
the current planner does an append/sort/unique, which especially sucks for
plans with a LIMIT clause.

I think this should solve some of the problems that Kyotarosan encountered
in his episode of dancing with indices over here ->
/messages/by-id/20131031.194310.212107585.horiguchi.kyotaro@lab.ntt.co.jp
where he was unable to prove that he could trim down sort nodes once all of
the columns of a unique index had been seen in the order by due to not
knowing if joins were going to duplicate the outer rows.

2.
It should also be possible to reuse a join in situations like:
create view product_sales as select p.productid,sum(s.qty) soldqty from
product p inner join sales s group by p.productid;

Where the consumer does:

select ps.*,p.current_price from product_sales ps inner join product p on
ps.productid = p.productid;

Here we'd currently join the product table twice, both times on the same
condition, but we could safely not bother doing that if we knew that the
join could never match more than 1 row on the inner side. Unfortunately I
deal with horrid situations like this daily, where people have used views
from within views, and all the same tables end up getting joined multiple
times :-(

Of course, both 1 and 2 should be considered separately from the attached
patch, this was just to show where it might lead.

Performance of the patch are as follows:

Test case:
create table t1 (id int primary key);
create table t2 (id int primary key);

insert into t1 select x.x from generate_series(1,1000000) x(x);
insert into t2 select x.x from generate_series(1,1000000) x(x);

vacuum analyze;

Query:
select count(t2.id) from t1 left outer join t2 on t1.id=t2.id;

Values are in transactions per second.

JOIN type Patched Unpatched new runtime
hash 1.295764 1.226188 94.63%
merge 1.786248 1.776605 99.46%
nestloop 0.465356 0.443015 95.20%

Things improve a bit more when using a varchar instead of an int:

hash 1.198821 1.102183 91.94%
I've attached the full benchmark results so as not to spam the thread.

Comments are welcome

Regards

David Rowley

dgrowleyml@gmail.com

over 11 years ago

In reply to: David Rowley (#1)

Re: Performance improvement for joins where outer side is unique

On 1 January 2015 at 02:47, David Rowley <dgrowleyml@gmail.com> wrote:

Hi,

I've been hacking a bit at the join code again... This time I've been
putting some effort into optimising the case where the inner side of the
join is known to be unique.
For example, given the tables:

create table t1 (id int primary key);
create table t2 (id int primary key);

And query such as:

select * from t1 left outer join t2 on t1.id=t2.id;

It is possible to deduce at planning time that "for each row in the outer
relation, only 0 or 1 rows can exist in the inner relation", (inner being
t2)

I've been hacking at this unique join idea again and I've now got it
working for all join types -- Patch attached.

Here's how the performance is looking:

postgres=# create table t1 (id int primary key);
CREATE TABLE
postgres=# create table t2 (id int primary key);
CREATE TABLE
postgres=# insert into t1 select x.x from generate_series(1,1000000) x(x);
INSERT 0 1000000
postgres=# insert into t2 select x.x from generate_series(1,1000000) x(x);
INSERT 0 1000000
postgres=# vacuum analyze;
VACUUM
postgres=# \q

Query: select count(t1.id) from t1 inner join t2 on t1.id=t2.id;

With Patch on master as of 32bf6ee

D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 78
latency average: 769.231 ms
tps = 1.288260 (including connections establishing)
tps = 1.288635 (excluding connections establishing)

Master as of 32bf6ee

D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 70
latency average: 857.143 ms
tps = 1.158905 (including connections establishing)
tps = 1.159264 (excluding connections establishing)

That's a 10% performance increase.

I still need to perform more thorough benchmarking with different data
types.

One weird thing that I noticed before is that in an earlier revision of the
patch in the executor's join Initialise node code, I had set the
unique_inner to true for semi joins and replaced the SEMI_JOIN check for a
unique_join check in the execute node for each join method. With this the
performance results barely changed from standard... I've yet to find out
why.

The patch also has added a property to the EXPLAIN (VERBOSE) output which
states if the join was found to be unique or not.

The patch also still requires a final pass of comment fix-ups. I've just
plain run out of time for now.

I'll pick this up in 2 weeks time.

Regards

David Rowley

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 11 years ago

In reply to: David Rowley (#2)

Re: Performance improvement for joins where outer side is unique

Hi, I had a look on this patch. Although I haven't understood
whole the stuff and all of the related things, I will comment as
possible.

Performance:

I looked on the performance gain this patch gives. For several
on-memory joins, I had gains about 3% for merge join, 5% for hash
join, and 10% for nestloop (@CentOS6), for simple 1-level joins
with aggregation similar to what you mentioned in previous
mail like this.

=# SELECT count(*) FROM t1 JOIN t2 USING (id);

Of course, the gain would be trivial when returning many tuples,
or scans go to disk.

I haven't measured the loss by additional computation when the
query is regarded as not a "single join".

Explain representation:

I don't like that the 'Unique Join:" occupies their own lines in
the result of explain, moreover, it doesn't show the meaning
clearly. What about the representation like the following? Or,
It might not be necessary to appear there.

   Nested Loop ...
     Output: ....
 -   Unique Jion: Yes
       -> Seq Scan on public.t2 (cost = ...
 -     -> Seq Scan on public.t1 (cost = ....
 +     -> Seq Scan on public.t1 (unique) (cost = ....

Coding:

The style looks OK. Could applied on master.
It looks to work as you expected for some cases.

Random comments follow.

- Looking specialjoin_is_unique_join, you seem to assume that
!delay_upper_joins is a sufficient condition for not being
"unique join". The former simplly indicates that "don't
commute with upper OJs" and the latter seems to indicate that
"the RHS is unique", Could you tell me what kind of
relationship is there between them?

- The naming "unique join" seems a bit obscure for me, but I
don't have no better idea:( However, the member name
"has_unique_join" seems to be better to be "is_unique_join".

- eclassjoin_is_unique_join involves seemingly semi-exhaustive
scan on ec members for every element in joinlist. Even if not,
it might be thought to be a bit too large for the gain. Could
you do the equivelant things by more small code?

regards,

Jan 2015 00:37:19 +1300, David Rowley <dgrowleyml@gmail.com> wrote in <CAApHDvod_uCMoUPovdpXbNkw50O14x3wwKoJmZLxkbBn71zdEg@mail.gmail.com>

On 1 January 2015 at 02:47, David Rowley <dgrowleyml@gmail.com> wrote:

Hi,

I've been hacking a bit at the join code again... This time I've been
putting some effort into optimising the case where the inner side of the
join is known to be unique.
For example, given the tables:

create table t1 (id int primary key);
create table t2 (id int primary key);

And query such as:

select * from t1 left outer join t2 on t1.id=t2.id;

It is possible to deduce at planning time that "for each row in the outer
relation, only 0 or 1 rows can exist in the inner relation", (inner being
t2)

I've been hacking at this unique join idea again and I've now got it
working for all join types -- Patch attached.

Here's how the performance is looking:

postgres=# create table t1 (id int primary key);
CREATE TABLE
postgres=# create table t2 (id int primary key);
CREATE TABLE
postgres=# insert into t1 select x.x from generate_series(1,1000000) x(x);
INSERT 0 1000000
postgres=# insert into t2 select x.x from generate_series(1,1000000) x(x);
INSERT 0 1000000
postgres=# vacuum analyze;
VACUUM
postgres=# \q

Query: select count(t1.id) from t1 inner join t2 on t1.id=t2.id;

With Patch on master as of 32bf6ee

D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 78
latency average: 769.231 ms
tps = 1.288260 (including connections establishing)
tps = 1.288635 (excluding connections establishing)

Master as of 32bf6ee

D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 70
latency average: 857.143 ms
tps = 1.158905 (including connections establishing)
tps = 1.159264 (excluding connections establishing)

That's a 10% performance increase.

I still need to perform more thorough benchmarking with different data
types.

One weird thing that I noticed before is that in an earlier revision of the
patch in the executor's join Initialise node code, I had set the
unique_inner to true for semi joins and replaced the SEMI_JOIN check for a
unique_join check in the execute node for each join method. With this the
performance results barely changed from standard... I've yet to find out
why.

I don't know even what you did precisely but if you do such a
thing, you perhaps should change the name of "unique_inner" to
something representing that it reads up to one row from the inner
for one outer row. (Sorry I have no good idea for this..)

The patch also has added a property to the EXPLAIN (VERBOSE) output which
states if the join was found to be unique or not.

The patch also still requires a final pass of comment fix-ups. I've just
plain run out of time for now.

I'll pick this up in 2 weeks time.

Regards

David Rowley

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 11 years ago

In reply to: David Rowley (#2)

Re: Performance improvement for joins where outer side is unique

Hi David,

I've been looking at this patch, mostly because it seems like a great
starting point for improving estimation for joins on multi-column FKs.

Currently we do this:

CREATE TABLE parent (a INT, b INT, PRIMARY KEY (a,b));
CREATE TABLE child (a INT, b INT, FOREIGN KEY (a,b)
REFERENCES parent (a,b));

INSERT INTO parent SELECT i, i FROM generate_series(1,1000000) s(i);
INSERT INTO child SELECT i, i FROM generate_series(1,1000000) s(i);

ANALYZE;

EXPLAIN SELECT * FROM parent JOIN child USING (a,b);

QUERY PLAN
---------------------------------------------------------------------
Hash Join (cost=33332.00..66978.01 rows=1 width=8)
Hash Cond: ((parent.a = child.a) AND (parent.b = child.b))
-> Seq Scan on parent (cost=0.00..14425.00 rows=1000000 width=8)
-> Hash (cost=14425.00..14425.00 rows=1000000 width=8)
-> Seq Scan on child (cost=0.00..14425.00 rows=1000000 width=8)
(5 rows)

Which is of course non-sense, because we know it's a join on FK, so the
join will produce 1M rows (just like the child table).

This seems like a rather natural extension of what you're doing in this
patch, except that it only affects the optimizer and not the executor.
Do you have any plans in this direction? If not, I'll pick this up as I
do have that on my TODO.

regards

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 11 years ago

In reply to: David Rowley (#2)

Re: Performance improvement for joins where outer side is unique

Hi,

I tried to do an initdb with the patch applied, and seems there's a bug
somewhere in analyzejoins.c:

tomas@rimmer ~ $ pg_ctl -D tmp/pg-unidata init
The files belonging to this database system will be owned by user "tomas".
This user must also own the server process.

The database cluster will be initialized with locale "en_US".
The default database encoding has accordingly been set to "LATIN1".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory tmp/pg-unidata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... sysv
creating configuration files ... ok
creating template1 database in tmp/pg-unidata/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... TRAP:
FailedAssertion("!(index_vars != ((List *) ((void *)0)))", File:
"analyzejoins.c", Line: 414)
sh: line 1: 339 Aborted
"/home/tomas/pg-unijoins/bin/postgres" --single -F -O -c
search_path=pg_catalog -c exit_on_error=true template1 > /dev/null
child process exited with exit code 134
initdb: removing data directory "tmp/pg-unidata"
pg_ctl: database system initialization failed

The problem seems to be the last command in setup_description() at
src/bin/initdb/initdb.c:1843, i.e. this query:

WITH funcdescs AS (
SELECT p.oid as p_oid, oprname,
coalesce(obj_description(o.oid, 'pg_operator'),'') as opdesc
FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid )
INSERT INTO pg_description
SELECT p_oid, 'pg_proc'::regclass, 0,
'implementation of ' || oprname || ' operator'
FROM funcdescs
WHERE opdesc NOT LIKE 'deprecated%' AND
NOT EXISTS (SELECT 1 FROM pg_description
WHERE objoid = p_oid AND classoid = 'pg_proc'::regclass)
And particularly the join in the CTE, i.e. this fails

SELECT * FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid

I'm not quite sure why, but eclassjoin_is_unique_join() never actually
jumps into this part (line ~400):

if (relvar != NULL && candidaterelvar != NULL)
{
...
index_vars = lappend(index_vars, candidaterelvar);
...
}

so the index_vars is NIL. Not sure why, but I'm sure you'll spot the
issue right away.

BTW, I find this coding (first cast, then check) rather strange:

Var *var = (Var *) ecm->em_expr;

if (!IsA(var, Var))
continue; /* Ignore Consts */

It's probably harmless, but I find it confusing and I can't remember
seeing it elsewhere in the code (for example clausesel.c and such) use
this style:

... clause is (Node*) ...

if (IsA(clause, Var))
{
Var *var = (Var*)clause;
...
}

Var * var = NULL;

if (! IsA(clause, Var))
// error / continue

var = (Var*)clause;

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 11 years ago

In reply to: Tomas Vondra (#5)

Re: Performance improvement for joins where outer side is unique

On 26.2.2015 01:15, Tomas Vondra wrote:

I'm not quite sure why, but eclassjoin_is_unique_join() never actually
jumps into this part (line ~400):

if (relvar != NULL && candidaterelvar != NULL)
{
...
index_vars = lappend(index_vars, candidaterelvar);
...
}

so the index_vars is NIL. Not sure why, but I'm sure you'll spot the
issue right away.

FWIW this apparently happens because the code only expect that
EquivalenceMembers only contain Var, but in this particular case that's
not the case - it contains RelabelType (because oprcode is regproc, and
needs to be relabeled to oid).

Adding this before the IsA(var, Var) check fixes the issue

if (IsA(var, RelabelType))
var = (Var*) ((RelabelType *) var)->arg;

but this makes the code even more confusing, because 'var' suggests it's
a Var node, but it's RelabelType node.

Also, specialjoin_is_unique_join() may have the same problem, but I've
been unable to come up with an example.

regards

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Tomas Vondra (#6)

Re: Performance improvement for joins where outer side is unique

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

FWIW this apparently happens because the code only expect that
EquivalenceMembers only contain Var, but in this particular case that's
not the case - it contains RelabelType (because oprcode is regproc, and
needs to be relabeled to oid).

If it thinks an EquivalenceMember must be a Var, it's outright broken;
I'm pretty sure any nonvolatile expression is possible.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 11 years ago

In reply to: Tom Lane (#7)

Re: Performance improvement for joins where outer side is unique

On 26.2.2015 18:34, Tom Lane wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

FWIW this apparently happens because the code only expect that
EquivalenceMembers only contain Var, but in this particular case that's
not the case - it contains RelabelType (because oprcode is regproc, and
needs to be relabeled to oid).

If it thinks an EquivalenceMember must be a Var, it's outright
broken; I'm pretty sure any nonvolatile expression is possible.

I came to the same conclusion, because even with the RelabelType fix
it's trivial to crash it with a query like this:

SELECT 1 FROM pg_proc p JOIN pg_operator o
ON oprcode = (p.oid::int4 + 1);

I think fixing this (e.g. by restricting the optimization to Var-only
cases) should not be difficult, although it might be nice to handle
generic expressions too, and something like examine_variable() should do
the trick. But I think UNIQUE indexes on expressions are not all that
common.

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Performance improvement for joins where outer side is unique

Attachments:

Attachments:

Attachments:

Attachments:

Attachments:

Attachments: