Performance improvement for joins where outer side is unique
Hi,
I've been hacking a bit at the join code again... This time I've been
putting some effort into optimising the case where the inner side of the
join is known to be unique.
For example, given the tables:
create table t1 (id int primary key);
create table t2 (id int primary key);
And query such as:
select * from t1 left outer join t2 on t1.id=t2.id;
It is possible to deduce at planning time that "for each row in the outer
relation, only 0 or 1 rows can exist in the inner relation", (inner being
t2)
In all of our join algorithms in the executor, if the join type is SEMI,
we skip to the next outer row once we find a matching inner row. This is
because we don't want to allow duplicate rows in the inner side to
duplicate outer rows in the result set. Obviously this is required per SQL
spec. I believe we can also skip to the next outer row in this case when
we've managed to prove that no other row can possibly exist that matches
the current outer row, due to a unique index or group by/distinct clause
(for subqueries).
I've attached a patch which aims to prove this idea works. Although so far
I've only gotten around to implementing this for outer joins.
Since code already existed in analyzejoin.c which aimed to determine if a
join to a relation was unique on the join's condition, the patch is pretty
much just some extra plumbing and a small rewire of analyzejoin.c, which
just aims to get the "unique_inner" bool value down to the join node.
The performance improvement is somewhere around 5-10% for hash joins, and a
bit less for merge join. In theory it could be 50% for nested loop joins.
In my life in the OLTP world, these "unique joins" pretty much cover all
joins that ever happen ever. Perhaps the OLAP world is different, so from
my point of view this is going to be a very nice performance gain.
I'm seeing this patch (or a more complete version) as the first step to a
whole number of other planner improvements:
A couple of examples of those are:
1.
explain select * from sales s inner join product p on p.productid =
s.productid order by s.productid,p.name;
The current plan for this looks like:
QUERY PLAN
--------------------------------------------------------------------------------
Sort (cost=149.80..152.80 rows=1200 width=46)
Sort Key: s.productid, p.name
-> Hash Join (cost=37.00..88.42 rows=1200 width=46)
Hash Cond: (s.productid = p.productid)
-> Seq Scan on sales s (cost=0.00..31.40 rows=2140 width=8)
-> Hash (cost=22.00..22.00 rows=1200 width=38)
-> Seq Scan on product p (cost=0.00..22.00 rows=1200
width=38)
But in reality we could have executed this with a simple merge join using
the PK of product (productid) to provide presorted input. The extra sort
column on p.name is redundant due to productid being unique.
The UNION planning is crying out for help for cases like this. Where it
could provide sorted input on a unique index, providing the union's
targetlist contained all of the unique index's columns, and we also managed
to find an index on the other part of the union on the same set of columns,
then we could perform a Merge Append and a Unique. This would cause a
signification improvement in execution time for these types of queries, as
the current planner does an append/sort/unique, which especially sucks for
plans with a LIMIT clause.
I think this should solve some of the problems that Kyotarosan encountered
in his episode of dancing with indices over here ->
/messages/by-id/20131031.194310.212107585.horiguchi.kyotaro@lab.ntt.co.jp
where he was unable to prove that he could trim down sort nodes once all of
the columns of a unique index had been seen in the order by due to not
knowing if joins were going to duplicate the outer rows.
2.
It should also be possible to reuse a join in situations like:
create view product_sales as select p.productid,sum(s.qty) soldqty from
product p inner join sales s group by p.productid;
Where the consumer does:
select ps.*,p.current_price from product_sales ps inner join product p on
ps.productid = p.productid;
Here we'd currently join the product table twice, both times on the same
condition, but we could safely not bother doing that if we knew that the
join could never match more than 1 row on the inner side. Unfortunately I
deal with horrid situations like this daily, where people have used views
from within views, and all the same tables end up getting joined multiple
times :-(
Of course, both 1 and 2 should be considered separately from the attached
patch, this was just to show where it might lead.
Performance of the patch are as follows:
Test case:
create table t1 (id int primary key);
create table t2 (id int primary key);
insert into t1 select x.x from generate_series(1,1000000) x(x);
insert into t2 select x.x from generate_series(1,1000000) x(x);
vacuum analyze;
Query:
select count(t2.id) from t1 left outer join t2 on t1.id=t2.id;
Values are in transactions per second.
JOIN type Patched Unpatched new runtime
hash 1.295764 1.226188 94.63%
merge 1.786248 1.776605 99.46%
nestloop 0.465356 0.443015 95.20%
Things improve a bit more when using a varchar instead of an int:
hash 1.198821 1.102183 91.94%
I've attached the full benchmark results so as not to spam the thread.
Comments are welcome
Regards
David Rowley
On 1 January 2015 at 02:47, David Rowley <dgrowleyml@gmail.com> wrote:
Hi,
I've been hacking a bit at the join code again... This time I've been
putting some effort into optimising the case where the inner side of the
join is known to be unique.
For example, given the tables:create table t1 (id int primary key);
create table t2 (id int primary key);And query such as:
select * from t1 left outer join t2 on t1.id=t2.id;
It is possible to deduce at planning time that "for each row in the outer
relation, only 0 or 1 rows can exist in the inner relation", (inner being
t2)
I've been hacking at this unique join idea again and I've now got it
working for all join types -- Patch attached.
Here's how the performance is looking:
postgres=# create table t1 (id int primary key);
CREATE TABLE
postgres=# create table t2 (id int primary key);
CREATE TABLE
postgres=# insert into t1 select x.x from generate_series(1,1000000) x(x);
INSERT 0 1000000
postgres=# insert into t2 select x.x from generate_series(1,1000000) x(x);
INSERT 0 1000000
postgres=# vacuum analyze;
VACUUM
postgres=# \q
Query: select count(t1.id) from t1 inner join t2 on t1.id=t2.id;
With Patch on master as of 32bf6ee
D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 78
latency average: 769.231 ms
tps = 1.288260 (including connections establishing)
tps = 1.288635 (excluding connections establishing)
Master as of 32bf6ee
D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 70
latency average: 857.143 ms
tps = 1.158905 (including connections establishing)
tps = 1.159264 (excluding connections establishing)
That's a 10% performance increase.
I still need to perform more thorough benchmarking with different data
types.
One weird thing that I noticed before is that in an earlier revision of the
patch in the executor's join Initialise node code, I had set the
unique_inner to true for semi joins and replaced the SEMI_JOIN check for a
unique_join check in the execute node for each join method. With this the
performance results barely changed from standard... I've yet to find out
why.
The patch also has added a property to the EXPLAIN (VERBOSE) output which
states if the join was found to be unique or not.
The patch also still requires a final pass of comment fix-ups. I've just
plain run out of time for now.
I'll pick this up in 2 weeks time.
Regards
David Rowley
Attachments:
unijoin_2015-01-31_9af9cfa.patchapplication/octet-stream; name=unijoin_2015-01-31_9af9cfa.patchDownload+800-84
Hi, I had a look on this patch. Although I haven't understood
whole the stuff and all of the related things, I will comment as
possible.
Performance:
I looked on the performance gain this patch gives. For several
on-memory joins, I had gains about 3% for merge join, 5% for hash
join, and 10% for nestloop (@CentOS6), for simple 1-level joins
with aggregation similar to what you mentioned in previous
mail like this.
=# SELECT count(*) FROM t1 JOIN t2 USING (id);
Of course, the gain would be trivial when returning many tuples,
or scans go to disk.
I haven't measured the loss by additional computation when the
query is regarded as not a "single join".
Explain representation:
I don't like that the 'Unique Join:" occupies their own lines in
the result of explain, moreover, it doesn't show the meaning
clearly. What about the representation like the following? Or,
It might not be necessary to appear there.
Nested Loop ...
Output: ....
- Unique Jion: Yes
-> Seq Scan on public.t2 (cost = ...
- -> Seq Scan on public.t1 (cost = ....
+ -> Seq Scan on public.t1 (unique) (cost = ....
Coding:
The style looks OK. Could applied on master.
It looks to work as you expected for some cases.
Random comments follow.
- Looking specialjoin_is_unique_join, you seem to assume that
!delay_upper_joins is a sufficient condition for not being
"unique join". The former simplly indicates that "don't
commute with upper OJs" and the latter seems to indicate that
"the RHS is unique", Could you tell me what kind of
relationship is there between them?
- The naming "unique join" seems a bit obscure for me, but I
don't have no better idea:( However, the member name
"has_unique_join" seems to be better to be "is_unique_join".
- eclassjoin_is_unique_join involves seemingly semi-exhaustive
scan on ec members for every element in joinlist. Even if not,
it might be thought to be a bit too large for the gain. Could
you do the equivelant things by more small code?
regards,
Jan 2015 00:37:19 +1300, David Rowley <dgrowleyml@gmail.com> wrote in <CAApHDvod_uCMoUPovdpXbNkw50O14x3wwKoJmZLxkbBn71zdEg@mail.gmail.com>
On 1 January 2015 at 02:47, David Rowley <dgrowleyml@gmail.com> wrote:
Hi,
I've been hacking a bit at the join code again... This time I've been
putting some effort into optimising the case where the inner side of the
join is known to be unique.
For example, given the tables:create table t1 (id int primary key);
create table t2 (id int primary key);And query such as:
select * from t1 left outer join t2 on t1.id=t2.id;
It is possible to deduce at planning time that "for each row in the outer
relation, only 0 or 1 rows can exist in the inner relation", (inner being
t2)I've been hacking at this unique join idea again and I've now got it
working for all join types -- Patch attached.Here's how the performance is looking:
postgres=# create table t1 (id int primary key);
CREATE TABLE
postgres=# create table t2 (id int primary key);
CREATE TABLE
postgres=# insert into t1 select x.x from generate_series(1,1000000) x(x);
INSERT 0 1000000
postgres=# insert into t2 select x.x from generate_series(1,1000000) x(x);
INSERT 0 1000000
postgres=# vacuum analyze;
VACUUM
postgres=# \qQuery: select count(t1.id) from t1 inner join t2 on t1.id=t2.id;
With Patch on master as of 32bf6ee
D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 78
latency average: 769.231 ms
tps = 1.288260 (including connections establishing)
tps = 1.288635 (excluding connections establishing)Master as of 32bf6ee
D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 70
latency average: 857.143 ms
tps = 1.158905 (including connections establishing)
tps = 1.159264 (excluding connections establishing)That's a 10% performance increase.
I still need to perform more thorough benchmarking with different data
types.One weird thing that I noticed before is that in an earlier revision of the
patch in the executor's join Initialise node code, I had set the
unique_inner to true for semi joins and replaced the SEMI_JOIN check for a
unique_join check in the execute node for each join method. With this the
performance results barely changed from standard... I've yet to find out
why.
I don't know even what you did precisely but if you do such a
thing, you perhaps should change the name of "unique_inner" to
something representing that it reads up to one row from the inner
for one outer row. (Sorry I have no good idea for this..)
The patch also has added a property to the EXPLAIN (VERBOSE) output which
states if the join was found to be unique or not.The patch also still requires a final pass of comment fix-ups. I've just
plain run out of time for now.I'll pick this up in 2 weeks time.
Regards
David Rowley
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi David,
I've been looking at this patch, mostly because it seems like a great
starting point for improving estimation for joins on multi-column FKs.
Currently we do this:
CREATE TABLE parent (a INT, b INT, PRIMARY KEY (a,b));
CREATE TABLE child (a INT, b INT, FOREIGN KEY (a,b)
REFERENCES parent (a,b));
INSERT INTO parent SELECT i, i FROM generate_series(1,1000000) s(i);
INSERT INTO child SELECT i, i FROM generate_series(1,1000000) s(i);
ANALYZE;
EXPLAIN SELECT * FROM parent JOIN child USING (a,b);
QUERY PLAN
---------------------------------------------------------------------
Hash Join (cost=33332.00..66978.01 rows=1 width=8)
Hash Cond: ((parent.a = child.a) AND (parent.b = child.b))
-> Seq Scan on parent (cost=0.00..14425.00 rows=1000000 width=8)
-> Hash (cost=14425.00..14425.00 rows=1000000 width=8)
-> Seq Scan on child (cost=0.00..14425.00 rows=1000000 width=8)
(5 rows)
Which is of course non-sense, because we know it's a join on FK, so the
join will produce 1M rows (just like the child table).
This seems like a rather natural extension of what you're doing in this
patch, except that it only affects the optimizer and not the executor.
Do you have any plans in this direction? If not, I'll pick this up as I
do have that on my TODO.
regards
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
I tried to do an initdb with the patch applied, and seems there's a bug
somewhere in analyzejoins.c:
tomas@rimmer ~ $ pg_ctl -D tmp/pg-unidata init
The files belonging to this database system will be owned by user "tomas".
This user must also own the server process.
The database cluster will be initialized with locale "en_US".
The default database encoding has accordingly been set to "LATIN1".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory tmp/pg-unidata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... sysv
creating configuration files ... ok
creating template1 database in tmp/pg-unidata/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... TRAP:
FailedAssertion("!(index_vars != ((List *) ((void *)0)))", File:
"analyzejoins.c", Line: 414)
sh: line 1: 339 Aborted
"/home/tomas/pg-unijoins/bin/postgres" --single -F -O -c
search_path=pg_catalog -c exit_on_error=true template1 > /dev/null
child process exited with exit code 134
initdb: removing data directory "tmp/pg-unidata"
pg_ctl: database system initialization failed
The problem seems to be the last command in setup_description() at
src/bin/initdb/initdb.c:1843, i.e. this query:
WITH funcdescs AS (
SELECT p.oid as p_oid, oprname,
coalesce(obj_description(o.oid, 'pg_operator'),'') as opdesc
FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid )
INSERT INTO pg_description
SELECT p_oid, 'pg_proc'::regclass, 0,
'implementation of ' || oprname || ' operator'
FROM funcdescs
WHERE opdesc NOT LIKE 'deprecated%' AND
NOT EXISTS (SELECT 1 FROM pg_description
WHERE objoid = p_oid AND classoid = 'pg_proc'::regclass)
And particularly the join in the CTE, i.e. this fails
SELECT * FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid
I'm not quite sure why, but eclassjoin_is_unique_join() never actually
jumps into this part (line ~400):
if (relvar != NULL && candidaterelvar != NULL)
{
...
index_vars = lappend(index_vars, candidaterelvar);
...
}
so the index_vars is NIL. Not sure why, but I'm sure you'll spot the
issue right away.
BTW, I find this coding (first cast, then check) rather strange:
Var *var = (Var *) ecm->em_expr;
if (!IsA(var, Var))
continue; /* Ignore Consts */
It's probably harmless, but I find it confusing and I can't remember
seeing it elsewhere in the code (for example clausesel.c and such) use
this style:
... clause is (Node*) ...
if (IsA(clause, Var))
{
Var *var = (Var*)clause;
...
}
or
Var * var = NULL;
if (! IsA(clause, Var))
// error / continue
var = (Var*)clause;
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 26.2.2015 01:15, Tomas Vondra wrote:
I'm not quite sure why, but eclassjoin_is_unique_join() never actually
jumps into this part (line ~400):if (relvar != NULL && candidaterelvar != NULL)
{
...
index_vars = lappend(index_vars, candidaterelvar);
...
}so the index_vars is NIL. Not sure why, but I'm sure you'll spot the
issue right away.
FWIW this apparently happens because the code only expect that
EquivalenceMembers only contain Var, but in this particular case that's
not the case - it contains RelabelType (because oprcode is regproc, and
needs to be relabeled to oid).
Adding this before the IsA(var, Var) check fixes the issue
if (IsA(var, RelabelType))
var = (Var*) ((RelabelType *) var)->arg;
but this makes the code even more confusing, because 'var' suggests it's
a Var node, but it's RelabelType node.
Also, specialjoin_is_unique_join() may have the same problem, but I've
been unable to come up with an example.
regards
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
FWIW this apparently happens because the code only expect that
EquivalenceMembers only contain Var, but in this particular case that's
not the case - it contains RelabelType (because oprcode is regproc, and
needs to be relabeled to oid).
If it thinks an EquivalenceMember must be a Var, it's outright broken;
I'm pretty sure any nonvolatile expression is possible.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 26.2.2015 18:34, Tom Lane wrote:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
FWIW this apparently happens because the code only expect that
EquivalenceMembers only contain Var, but in this particular case that's
not the case - it contains RelabelType (because oprcode is regproc, and
needs to be relabeled to oid).If it thinks an EquivalenceMember must be a Var, it's outright
broken; I'm pretty sure any nonvolatile expression is possible.
I came to the same conclusion, because even with the RelabelType fix
it's trivial to crash it with a query like this:
SELECT 1 FROM pg_proc p JOIN pg_operator o
ON oprcode = (p.oid::int4 + 1);
I think fixing this (e.g. by restricting the optimization to Var-only
cases) should not be difficult, although it might be nice to handle
generic expressions too, and something like examine_variable() should do
the trick. But I think UNIQUE indexes on expressions are not all that
common.
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 3 February 2015 at 22:23, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hi, I had a look on this patch. Although I haven't understood
whole the stuff and all of the related things, I will comment as
possible.
Great, thank you for taking the time to look and review the patch.
Performance:
I looked on the performance gain this patch gives. For several
on-memory joins, I had gains about 3% for merge join, 5% for hash
join, and 10% for nestloop (@CentOS6), for simple 1-level joins
with aggregation similar to what you mentioned in previous
mail like this.=# SELECT count(*) FROM t1 JOIN t2 USING (id);
Of course, the gain would be trivial when returning many tuples,
or scans go to disk.
That's true, but joins where rows are filtered after the join condition
will win here too:
For example select * from t1 inner join t2 on t1.id=t2.id where t1.value >
t2.value;
Also queries with a GROUP BY clause. (Since grouping is always performed
after the join :-( )
I haven't measured the loss by additional computation when the
query is regarded as not a "single join".
I think this would be hard to measure, but likely if it is measurable then
you'd want to look at just planning time, rather than planning and
execution time.
Explain representation:
I don't like that the 'Unique Join:" occupies their own lines in
the result of explain, moreover, it doesn't show the meaning
clearly. What about the representation like the following? Or,
It might not be necessary to appear there.Nested Loop ... Output: .... - Unique Jion: Yes -> Seq Scan on public.t2 (cost = ... - -> Seq Scan on public.t1 (cost = .... + -> Seq Scan on public.t1 (unique) (cost = ....
Yeah I'm not too big a fan of this either. I did at one evolution of the
patch I had "Unique Left Join" in the join node's line in the explain
output, but I hated that more and changed it just to be just in the VERBOSE
output, and after I did that I didn't want to change the join node's line
only when in verbose mode. I do quite like that it's a separate item for
the XML and JSON explain output. That's perhaps quite useful when the
explain output must be processed by software.
I'm totally open for better ideas on names, but I just don't have any at
the moment.
Coding:
The style looks OK. Could applied on master.
It looks to work as you expected for some cases.Random comments follow.
- Looking specialjoin_is_unique_join, you seem to assume that
!delay_upper_joins is a sufficient condition for not being
"unique join". The former simplly indicates that "don't
commute with upper OJs" and the latter seems to indicate that
"the RHS is unique", Could you tell me what kind of
relationship is there between them?
The rationalisation around that are from the (now changed) version of
join_is_removable(), where the code read:
/*
* Must be a non-delaying left join to a single baserel, else we aren't
* going to be able to do anything with it.
*/
if (sjinfo->jointype != JOIN_LEFT ||
sjinfo->delay_upper_joins)
return false;
I have to admit that I didn't go and investigate why delayed upper joins
cannot be removed by left join removal code, I really just assumed that
we're just unable to prove that a join to such a relation won't match more
than one outer side's row. I kept this to maintain that behaviour as I
assumed it was there for a good reason.
- The naming "unique join" seems a bit obscure for me, but I
don't have no better idea:( However, the member name
"has_unique_join" seems to be better to be "is_unique_join".
Yeah, I agree with this. I just can't find something short enough that
means "based on the join condition, the inner side of the join will never
produce more than 1 row for any single outer row". Unique join was the best
I came up with. I'm totally open for better ideas.
I agree that is_unique_join is better than has_unique_join. I must have
just copied the has_eclass_joins struct member without thinking too hard
about it. I've now changed this in my local copy of the patch.
- eclassjoin_is_unique_join involves seemingly semi-exhaustive
scan on ec members for every element in joinlist. Even if not,
it might be thought to be a bit too large for the gain. Could
you do the equivelant things by more small code?
I'd imagine some very complex queries could have many equivalence classes
and members, though I hadn't really thought that this number would be so
great that processing here would suffer much. There's quite a few fast
paths out, like the test to ensure that both rels are mentioned
in ec_relids. Though for a query which is so complex to have a great number
of eclass' and members, likely there would be quite a few relations
involved and planning would be slower anyway. I'm not quite sure how else I
could write this and still find the unique join cases each time. We can't
really just give up half way through looking.
Thanks again for the review.
Regards
David Rowley
On 26 February 2015 at 08:39, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:
I've been looking at this patch, mostly because it seems like a great
starting point for improving estimation for joins on multi-column FKs.Currently we do this:
CREATE TABLE parent (a INT, b INT, PRIMARY KEY (a,b));
CREATE TABLE child (a INT, b INT, FOREIGN KEY (a,b)
REFERENCES parent (a,b));INSERT INTO parent SELECT i, i FROM generate_series(1,1000000) s(i);
INSERT INTO child SELECT i, i FROM generate_series(1,1000000) s(i);ANALYZE;
EXPLAIN SELECT * FROM parent JOIN child USING (a,b);
QUERY PLAN
---------------------------------------------------------------------
Hash Join (cost=33332.00..66978.01 rows=1 width=8)
Hash Cond: ((parent.a = child.a) AND (parent.b = child.b))
-> Seq Scan on parent (cost=0.00..14425.00 rows=1000000 width=8)
-> Hash (cost=14425.00..14425.00 rows=1000000 width=8)
-> Seq Scan on child (cost=0.00..14425.00 rows=1000000 width=8)
(5 rows)Which is of course non-sense, because we know it's a join on FK, so the
join will produce 1M rows (just like the child table).This seems like a rather natural extension of what you're doing in this
patch, except that it only affects the optimizer and not the executor.
Do you have any plans in this direction? If not, I'll pick this up as I
do have that on my TODO.
Hi Tomas,
I guess similar analysis could be done on FKs as I'm doing on unique
indexes. Perhaps my patch for inner join removal can help you more with
that. You may notice that in this patch I have ended up changing the left
join removal code so that it just checks if has_unique_join is set for the
special join. Likely something similar could be done with your idea and the
inner join removals, just by adding some sort of flag on RelOptInfo to say
"join_row_exists" or some better name. Quite likely if there's any pending
foreign key triggers, then it won't matter at all for the sake of row
estimates.
Regards
David Rowley
On 27 February 2015 at 06:48, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:
On 26.2.2015 18:34, Tom Lane wrote:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
FWIW this apparently happens because the code only expect that
EquivalenceMembers only contain Var, but in this particular case that's
not the case - it contains RelabelType (because oprcode is regproc, and
needs to be relabeled to oid).If it thinks an EquivalenceMember must be a Var, it's outright
broken; I'm pretty sure any nonvolatile expression is possible.I came to the same conclusion, because even with the RelabelType fix
it's trivial to crash it with a query like this:SELECT 1 FROM pg_proc p JOIN pg_operator o
ON oprcode = (p.oid::int4 + 1);
Thanks for looking at this Tomas. Sorry it's taken this long for me to
respond, but I wanted to do so with a working patch.
I've made a few changes in the attached version:
1. Fixed Assert failure when eclass contained non-Var types, as reported by
you.
2. Added support for expression indexes.
The expression indexes should really be supported as with the previous
patch they worked ok with LEFT JOINs, but not INNER JOINs, that
inconsistency is pretty much a bug in my book, so I've fixed it.
The one weird quirk with the patch is that, if we had some tables like:
create table r1 (id int primary key, value int not null);
create table r2 (id int primary key);
And a query:
explain verbose select * from r1 inner join r2 on r1.id=r2.id where r2.id
=r1.value;
The join is not properly detected as a unique join. This is down to the
eclass containing 3 members, when the code finds the 2nd ec member for r1
it returns false as it already found another one. I'm not quite sure what
the fix is for this just yet as it's not quite clear to me how the code
would work if there were 2 vars from each relation in the same eclass... If
these Vars were of different types then which operators would we use for
them? I'm not sure if having eclassjoin_is_unique_join() append every
possible combination to index_exprs is the answer. I'm also not quite sure
if the complexity is worth the extra code either.
Updated patch attached.
Thank you for looking it and reporting that bug.
Regards
David Rowley
Attachments:
unijoin_2015-03-04_ac455bd.patchapplication/octet-stream; name=unijoin_2015-03-04_ac455bd.patchDownload+827-84
On 26 February 2015 at 13:15, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:
BTW, I find this coding (first cast, then check) rather strange:
Var *var = (Var *) ecm->em_expr;
if (!IsA(var, Var))
continue; /* Ignore Consts */It's probably harmless, but I find it confusing and I can't remember
seeing it elsewhere in the code (for example clausesel.c and such) use
this style:... clause is (Node*) ...
if (IsA(clause, Var))
{
Var *var = (Var*)clause;
...
}or
Var * var = NULL;
if (! IsA(clause, Var))
// error / continuevar = (Var*)clause;
Yeah, it does look a bit weird, but if you search the code for "IsA(var,
Var)" you'll see it's nothing new.
Regards
David Rowley
Sorry for the dealy, I've returned to this.
Performance:
I looked on the performance gain this patch gives. For several
on-memory joins, I had gains about 3% for merge join, 5% for hash
join, and 10% for nestloop (@CentOS6), for simple 1-level joins
with aggregation similar to what you mentioned in previous
mail like this.=# SELECT count(*) FROM t1 JOIN t2 USING (id);
Of course, the gain would be trivial when returning many tuples,
or scans go to disk.That's true, but joins where rows are filtered after the join condition
will win here too:
Yes, when not many tuples was returned, the gain remains in
inverse proportion to the number of the returning rows. Heavy
(time-consuming) on-memory join returning several rows should
have gain from this.
Also queries with a GROUP BY clause. (Since grouping is always performed
after the join :-( )
And in many cases clinged by additional sorts:p As such, so many
factors affect the performance so improvements which give linear
performance gain are often difficult to gain approval to being
applied. I suppose we need more evidence how and in what
situation we will receive the gain.
I haven't measured the loss by additional computation when the
query is regarded as not a "single join".
I think this would be hard to measure, but likely if it is measurable then
you'd want to look at just planning time, rather than planning and
execution time.
From the another point of view, the patch looks a bit large for
the gain (for me). Addition to it, it loops by many levels.
[mark_unique_joins()]
foreach (joinlist)
[eclassjoin_is_unique_join()]
foreach(joinlist)
foreach(root->eq_classes)
foreach(root->ec_members)
The innermost loop could be roughly said to iterate about 10^3*2
times for a join of 10 tables all of which have index and no
eclass is shared among them. From my expreince, we must have the
same effect by far less iteration levels.
This is caueed the query like this, as a bad case.
create table t1 (a int, b int);
create index i_t1 (a int);
create table t2 (like t1 including indexes);
....
create table t10 (.....);
insert into t1 (select a, a * 10 from generate_series(0, 100) a);
insert into t2 (select a * 10, a * 20 from generate_series(0, 100) a);
...
insert into t10 (select a * 90, a * 100 from generate_series(0, 100) a);
explain select t1.a, t10.b from t1 join t2 on (t1.b = t2.a) join t3 on (t2.b = t3.a) join t4 on (t3.b = t4.a) join t5 on (t4.b = t5.a) join t6 on (t5.b = t6.a) join t7 on (t6.b = t7.a) join t8 on (t7.b = t8.a) join t9 on (t8.b = t9.a) join t10 on (t9.b = t10.a);
The head takes 3ms for planning and the patched version takes
around 5ms while pure execution time is 1ms.. I think it is a too
long extra time.
Explain representation:
I don't like that the 'Unique Join:" occupies their own lines in
the result of explain, moreover, it doesn't show the meaning
clearly. What about the representation like the following? Or,
It might not be necessary to appear there.Nested Loop ... Output: .... - Unique Jion: Yes -> Seq Scan on public.t2 (cost = ... - -> Seq Scan on public.t1 (cost = .... + -> Seq Scan on public.t1 (unique) (cost = ....Yeah I'm not too big a fan of this either. I did at one evolution of the
patch I had "Unique Left Join" in the join node's line in the explain
output, but I hated that more and changed it just to be just in the VERBOSE
output, and after I did that I didn't want to change the join node's line
only when in verbose mode. I do quite like that it's a separate item for
the XML and JSON explain output. That's perhaps quite useful when the
explain output must be processed by software.I'm totally open for better ideas on names, but I just don't have any at
the moment.
"Unique Left Join" looks too bold. Anyway changing there is
rather easier.
Coding:
The style looks OK. Could applied on master.
It looks to work as you expected for some cases.Random comments follow.
- Looking specialjoin_is_unique_join, you seem to assume that
!delay_upper_joins is a sufficient condition for not being
"unique join". The former simplly indicates that "don't
commute with upper OJs" and the latter seems to indicate that
"the RHS is unique", Could you tell me what kind of
relationship is there between them?The rationalisation around that are from the (now changed) version of
join_is_removable(), where the code read:/*
* Must be a non-delaying left join to a single baserel, else we aren't
* going to be able to do anything with it.
*/
if (sjinfo->jointype != JOIN_LEFT ||
sjinfo->delay_upper_joins)
return false;I have to admit that I didn't go and investigate why delayed upper joins
cannot be removed by left join removal code, I really just assumed that
we're just unable to prove that a join to such a relation won't match more
than one outer side's row. I kept this to maintain that behaviour as I
assumed it was there for a good reason.
Yes, I suppose that you thought so and it should work as expected
as a logical calculation for now. But delay_upper_joins is the
condition for not being allowed to be removed and the meaning
looks not to have been changed, right? If so, I think they should
be treated according to their right meanings.
Specifically, delay_upper_joins should be removed from the
condition for unique_join to the original location, at the very
first of join_is_removable.
- The naming "unique join" seems a bit obscure for me, but I
don't have no better idea:( However, the member name
"has_unique_join" seems to be better to be "is_unique_join".Yeah, I agree with this. I just can't find something short enough that
means "based on the join condition, the inner side of the join will never
produce more than 1 row for any single outer row". Unique join was the best
I came up with. I'm totally open for better ideas.
It should be good to write the precise definition of "unique
join" in the appropriate comment.
I agree that is_unique_join is better than has_unique_join. I must have
just copied the has_eclass_joins struct member without thinking too hard
about it. I've now changed this in my local copy of the patch.- eclassjoin_is_unique_join involves seemingly semi-exhaustive
scan on ec members for every element in joinlist. Even if not,
it might be thought to be a bit too large for the gain. Could
you do the equivelant things by more small code?I'd imagine some very complex queries could have many equivalence classes
and members, though I hadn't really thought that this number would be so
great that processing here would suffer much. There's quite a few fast
paths out, like the test to ensure that both rels are mentioned
in ec_relids. Though for a query which is so complex to have a great number
of eclass' and members, likely there would be quite a few relations
involved and planning would be slower anyway. I'm not quite sure how else I
could write this and still find the unique join cases each time. We can't
really just give up half way through looking.
A rough estimate for a bad case is mentioned above. I had the
similar comment about the complexity for code to find implicitly
uniquely ordered join peer. Perhaps there's many improvable
points implemented easily by looping over join_list and
eclasses. But we would shuold do them without piled loops.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10 March 2015 at 19:19, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:
From the another point of view, the patch looks a bit large for
the gain (for me). Addition to it, it loops by many levels.[mark_unique_joins()]
foreach (joinlist)
[eclassjoin_is_unique_join()]
foreach(joinlist)
foreach(root->eq_classes)
foreach(root->ec_members)The innermost loop could be roughly said to iterate about 10^3*2
times for a join of 10 tables all of which have index and no
eclass is shared among them. From my expreince, we must have the
same effect by far less iteration levels.This is caueed the query like this, as a bad case.
create table t1 (a int, b int);
create index i_t1 (a int);
create table t2 (like t1 including indexes);
....
create table t10 (.....);
insert into t1 (select a, a * 10 from generate_series(0, 100) a);
insert into t2 (select a * 10, a * 20 from generate_series(0, 100) a);
...
insert into t10 (select a * 90, a * 100 from generate_series(0, 100) a);explain select t1.a, t10.b from t1 join t2 on (t1.b = t2.a) join t3 on
(t2.b = t3.a) join t4 on (t3.b = t4.a) join t5 on (t4.b = t5.a) join t6 on
(t5.b = t6.a) join t7 on (t6.b = t7.a) join t8 on (t7.b = t8.a) join t9 on
(t8.b = t9.a) join t10 on (t9.b = t10.a);The head takes 3ms for planning and the patched version takes
around 5ms while pure execution time is 1ms.. I think it is a too
long extra time.
This smells quite fishy to me. I'd be quite surprised if your machine took
an extra 2 ms to do this.
I've run what I think is the same test on my 5 year old i5 laptop and
attached the .sql file which I used to generate the same schema as you've
described.
I've also attached the results of the explain analyze "Planning Time:"
output from patched and unpatched using your test case.
I was unable to notice any difference in plan times between both versions.
In fact master came out slower, which is likely just the noise in the
results.
Just to see how long mark_unique_joins() takes with your test case I
changed query_planner() to call mark_unique_joins() 1 million times:
{
int x;
for (x = 0; x < 1000000;x++)
mark_unique_joins(root, joinlist);
}
I also got rid of the fast path test which bails out if the join is already
marked as unique.
/* check if we've already marked this join as unique on a previous call */
/*if (idxrel->is_unique_join)
return true;
*/
On my machine after making these changes, it takes 800 ms to plan the
query. So it seems that's around 800 nano seconds for a single call to
mark_unique_joins().
Perhaps you've accidentally compiled the patched version with debug asserts?
Are you able to retest this?
Regards
David Rowley
Hello. The performance drag was not so large after all.
For all that, I agree that the opition that this kind of separate
multiple-nested loops on relations, joins or ECs and so on for
searching something should be avoided. I personally feel that
additional time to such an extent (around 1%) would be tolerable
if it affected a wide range of queries or it brought more obvious
gain.
Unfortunately I can't decide this to be 'ready for commiter' for
now. I think we should have this on smaller footprint, in a
method without separate exhauxtive searching. I also had very
similar problem in the past but I haven't find such a way for my
problem..
At Wed, 11 Mar 2015 01:32:24 +1300, David Rowley <dgrowleyml@gmail.com> wrote in <CAApHDvpEXAjs6mV2ro4=3qbzpx=pLrteinX0J2YHq6wrp85pPw@mail.gmail.com>
explain select t1.a, t10.b from t1 join t2 on (t1.b = t2.a) join t3 on
(t2.b = t3.a) join t4 on (t3.b = t4.a) join t5 on (t4.b = t5.a) join t6 on
(t5.b = t6.a) join t7 on (t6.b = t7.a) join t8 on (t7.b = t8.a) join t9 on
(t8.b = t9.a) join t10 on (t9.b = t10.a);The head takes 3ms for planning and the patched version takes
around 5ms while pure execution time is 1ms.. I think it is a too
long extra time.This smells quite fishy to me. I'd be quite surprised if your machine took
an extra 2 ms to do this.
You're right. Sorry. I was amazed by the numbers..
I took again the times for both master and patched on master
(some conflict arised). Configured with no options so compiled
with -O2 and no assertions. Measured the planning time for the
test query 10 times and calcualted the average.
patched: 1.883ms (stddev 0.034)
master: 1.861ms (stddev 0.042)
About 0.02ms, 1% extra time looks to be taken by the extra
processing.
regards,
I've run what I think is the same test on my 5 year old i5 laptop and
attached the .sql file which I used to generate the same schema as you've
described.I've also attached the results of the explain analyze "Planning Time:"
output from patched and unpatched using your test case.I was unable to notice any difference in plan times between both versions.
In fact master came out slower, which is likely just the noise in the
results.Just to see how long mark_unique_joins() takes with your test case I
changed query_planner() to call mark_unique_joins() 1 million times:{
int x;
for (x = 0; x < 1000000;x++)
mark_unique_joins(root, joinlist);
}I also got rid of the fast path test which bails out if the join is already
marked as unique./* check if we've already marked this join as unique on a previous call */
/*if (idxrel->is_unique_join)
return true;
*/On my machine after making these changes, it takes 800 ms to plan the
query. So it seems that's around 800 nano seconds for a single call to
mark_unique_joins().Perhaps you've accidentally compiled the patched version with debug asserts?
Are you able to retest this?
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 13 March 2015 at 20:34, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello. The performance drag was not so large after all.
Great, thanks for retesting this.
For all that, I agree that the opition that this kind of separate
multiple-nested loops on relations, joins or ECs and so on for
searching something should be avoided. I personally feel that
additional time to such an extent (around 1%) would be tolerable
if it affected a wide range of queries or it brought more obvious
gain.
Do you have any ideas on an implementation of how we can avoid checking all
eclasses for each combination of joins?
The only thing I can think of would be to add a List of eclasses on
RelOptInfo for all eclasses that the RelOptInfo has. That would save the
looping over every single eclass in eclassjoin_is_unique_join() and save
from having to check if the relation is mentioned in the eclass before
continuing. I could certainly make this change, but I'd be worried that it
would just add bloat to the patch and cause a committed to question it.
I could also almost completely remove the extra plan time of your test case
by either adding a proper pre-check to ensure the relation has unique
indexes, rather than just indexes or I could add a new list to RelOptInfo
which stores unique index, then I could just check if that's NIL before
going any further in eclassjoin_is_unique_join(), but I come from a world
where every relation has a primary key, so I'd just imagine that would not
be hit in enough real cases. I'd imagine that pre-check is only there
because it's so cheap in the first place.
For testing, I added some code to mark_unique_joins() to spit out a NOTICE:
if (eclassjoin_is_unique_join(root, joinlist, rtr))
{
root->simple_rel_array[rtr->rtindex]->is_unique_join = true;
elog(NOTICE, "Unique Join: Yes");
}
else
elog(NOTICE, "Unique Join: No");
and the same below for special joins too.
On running the regression tests I see:
"Unique Join: Yes" 1557 times
"Unique Join: No" 11563 times
I would imagine the regression tests are not the best thing to base this
one as they tend to exercise more unusual cases.
I have an OLTP type application here which I will give a bit of exercise
and see how that compares.
Unfortunately I can't decide this to be 'ready for commiter' for
now. I think we should have this on smaller footprint, in a
method without separate exhauxtive searching. I also had very
similar problem in the past but I haven't find such a way for my
problem..
I don't think it's ready yet either. I've just been going over a few things
and looking at Tom's recent commit b557226 in pathnode.c I've got a feeling
that this patch would need to re-factor some code that's been modified
around the usage of relation_has_unique_index_for() as when this code is
called, the semi joins have already been analysed to see if they're unique,
so it could be just a case of ripping all of that out
of create_unique_path() and just putting a check to say rel->is_unique_join
in there. But if I do that then I'm not quite sure if
SpecialJoinInfo->semi_rhs_exprs and SpecialJoinInfo->semi_operators would
still be needed at all. These were only added in b557226. Changing this
would help reduce the extra planning time when the query contains
semi-joins. To be quite honest, this type of analysis belongs in
analyzejoin.c anyway. I'm tempted to hack at this part some more, but I'd
rather Tom had a quick glance at what I'm trying to do here first.
At Wed, 11 Mar 2015 01:32:24 +1300, David Rowley <dgrowleyml@gmail.com>
wrote in <CAApHDvpEXAjs6mV2ro4=3qbzpx=
pLrteinX0J2YHq6wrp85pPw@mail.gmail.com>explain select t1.a, t10.b from t1 join t2 on (t1.b = t2.a) join t3 on
(t2.b = t3.a) join t4 on (t3.b = t4.a) join t5 on (t4.b = t5.a) joint6 on
(t5.b = t6.a) join t7 on (t6.b = t7.a) join t8 on (t7.b = t8.a) join
t9 on
(t8.b = t9.a) join t10 on (t9.b = t10.a);
The head takes 3ms for planning and the patched version takes
around 5ms while pure execution time is 1ms.. I think it is a too
long extra time.This smells quite fishy to me. I'd be quite surprised if your machine
took
an extra 2 ms to do this.
You're right. Sorry. I was amazed by the numbers..
I took again the times for both master and patched on master
(some conflict arised). Configured with no options so compiled
with -O2 and no assertions. Measured the planning time for the
test query 10 times and calcualted the average.patched: 1.883ms (stddev 0.034)
master: 1.861ms (stddev 0.042)About 0.02ms, 1% extra time looks to be taken by the extra
processing.
Is this still using the same test query as I attached in my previous email?
This is still 25 times more of a slowdown as to what I witnessed by calling
mark_unique_joins() 1 million times in a tight loop, but of course much
closer to what I would have thought. You're getting 20,000 nanoseconds and
I'm getting 800 nanoseconds, but our overall planning times are very
similar, so I assume our processors are of similar speeds.
It's certainly difficult to know if the extra planning will pay off enough
for it to reduce total CPU time between both planning and executing the
query. There actually are cases where the planning time will be reduced by
this patch. In the case when a LEFT JOIN is removed the master code
currently goes off and re-checks all relations to see if the removal of
that 1 relation has caused it to be possible for other relations to be
removed, and this currently involves analysing unique indexes again to see
if the join is still unique. With this patch I've cut out most of the work
which is done in join_is_removable(), it no longer does that unique
analysis of the join condition as that's done in mark_unique_joins(). The
cases that should be faster are, if a join is already marked as unique then
the code will never test that again.
I have quite a few other ideas lined up which depend on knowing if a join
is unique, all these ideas naturally depend on this patch. Really the
5%-10% join speed-up was the best excuse that I could find have this work
actually accepted. My first email in this thread explains some of my other
ideas.
In addition to those ideas I've been thinking about how that if we could
determine that a plan returns a single row, e.g a key lookup then we could
likely safely cache those plans, as there were never be any data skew in
these situations. I have no immediate plans to go and work on this, but
quite possibly I will in the future if I manage to get unique joins in
first.
Regards
David Rowley
On 14 March 2015 at 14:51, David Rowley <dgrowleyml@gmail.com> wrote:
On 13 March 2015 at 20:34, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:For all that, I agree that the opition that this kind of separate
multiple-nested loops on relations, joins or ECs and so on for
searching something should be avoided. I personally feel that
additional time to such an extent (around 1%) would be tolerable
if it affected a wide range of queries or it brought more obvious
gain.For testing, I added some code to mark_unique_joins() to spit out a NOTICE:
if (eclassjoin_is_unique_join(root, joinlist, rtr))
{
root->simple_rel_array[rtr->rtindex]->is_unique_join = true;
elog(NOTICE, "Unique Join: Yes");
}
else
elog(NOTICE, "Unique Join: No");and the same below for special joins too.
On running the regression tests I see:
"Unique Join: Yes" 1557 times
"Unique Join: No" 11563 times
With this notice emitting code in place, I opened up pgAdmin and had a
click around for a few minutes.
If I search the log file I see:
Unique Join: No 940 times
Unique Join: Yes 585 times
It seems that joins with a unique inner side are quite common here.
Regards
David Rowley
On 14 March 2015 at 14:51, David Rowley <dgrowleyml@gmail.com> wrote:
On 13 March 2015 at 20:34, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:Unfortunately I can't decide this to be 'ready for commiter' for
now. I think we should have this on smaller footprint, in a
method without separate exhauxtive searching. I also had very
similar problem in the past but I haven't find such a way for my
problem..I don't think it's ready yet either. I've just been going over a few
things and looking at Tom's recent commit b557226 in pathnode.c I've got a
feeling that this patch would need to re-factor some code that's been
modified around the usage of relation_has_unique_index_for() as when this
code is called, the semi joins have already been analysed to see if they're
unique, so it could be just a case of ripping all of that out
of create_unique_path() and just putting a check to say rel->is_unique_join
in there. But if I do that then I'm not quite sure if
SpecialJoinInfo->semi_rhs_exprs and SpecialJoinInfo->semi_operators would
still be needed at all. These were only added in b557226. Changing this
would help reduce the extra planning time when the query contains
semi-joins. To be quite honest, this type of analysis belongs in
analyzejoin.c anyway. I'm tempted to hack at this part some more, but I'd
rather Tom had a quick glance at what I'm trying to do here first.
I decided to hack away any change the code Tom added in b557226. I've
changed it so that create_unique_path() now simply just uses if
(rel->is_unique_join), instead off all the calls to
relation_has_unique_index_for() and query_is_distinct_for(). This vastly
simplifies that code. One small change is that Tom's checks for uniqueness
on semi joins included checks for volatile functions, this check didn't
exist in the original join removal code, so I've left it out. We'll never
match a expression with a volatile function to a unique index as indexes
don't allow volatile function expressions anyway. So as I understand it
this only serves as a fast path out if the join condition has a volatile
function... But I'd assume that check is not all that cheap.
I ended up making query_supports_distinctness() and query_is_distinct_for()
static in analyzejoins.c as they're not used in any other files.
relation_has_unique_index_for() is also now only used in analyzejoins.c,
but I've not moved it into that file yet as I don't want to bloat the
patch. I just added a comment to say it needs moved.
I've also added a small amount of code to query_is_distinct_for() which
allows subqueries such as (select 1 a offset 0) to be marked as unique. I
thought it was a little silly that these were not being detected as unique,
so I fixed it. This has the side effect of allowing left join removals for
queries such as: select t1.* from t1 left join (select 1 a offset 0) a on
t1.id=a.a;
Updated patch attached.
Regards
David Rowley
Attachments:
unijoin_2015-03-14_81bd96a.patchapplication/octet-stream; name=unijoin_2015-03-14_81bd96a.patchDownload+907-143
Hello, I don't have enough time for now but made some
considerations on this.
It might be a way that marking unique join peer at bottom and
propagate up, not searching from top of join list.
Around create_join_clause might be a candidate for it.
I'll investigate that later.
regards,
At Sat, 14 Mar 2015 23:05:24 +1300, David Rowley <dgrowleyml@gmail.com> wrote in <CAApHDvoh-EKF51QQyNoJUe0eHYMZw6OzJjjgYP63Cmw7QfebjA@mail.gmail.com>
dgrowleyml> On 14 March 2015 at 14:51, David Rowley <dgrowleyml@gmail.com> wrote:
dgrowleyml>
dgrowleyml> > On 13 March 2015 at 20:34, Kyotaro HORIGUCHI <
dgrowleyml> > horiguchi.kyotaro@lab.ntt.co.jp> wrote:
dgrowleyml> >
dgrowleyml> >> Unfortunately I can't decide this to be 'ready for commiter' for
dgrowleyml> >>
dgrowleyml> > now. I think we should have this on smaller footprint, in a
dgrowleyml> >> method without separate exhauxtive searching. I also had very
dgrowleyml> >> similar problem in the past but I haven't find such a way for my
dgrowleyml> >> problem..
dgrowleyml> >>
dgrowleyml> >>
dgrowleyml> > I don't think it's ready yet either. I've just been going over a few
dgrowleyml> > things and looking at Tom's recent commit b557226 in pathnode.c I've got a
dgrowleyml> > feeling that this patch would need to re-factor some code that's been
dgrowleyml> > modified around the usage of relation_has_unique_index_for() as when this
dgrowleyml> > code is called, the semi joins have already been analysed to see if they're
dgrowleyml> > unique, so it could be just a case of ripping all of that out
dgrowleyml> > of create_unique_path() and just putting a check to say rel->is_unique_join
dgrowleyml> > in there. But if I do that then I'm not quite sure if
dgrowleyml> > SpecialJoinInfo->semi_rhs_exprs and SpecialJoinInfo->semi_operators would
dgrowleyml> > still be needed at all. These were only added in b557226. Changing this
dgrowleyml> > would help reduce the extra planning time when the query contains
dgrowleyml> > semi-joins. To be quite honest, this type of analysis belongs in
dgrowleyml> > analyzejoin.c anyway. I'm tempted to hack at this part some more, but I'd
dgrowleyml> > rather Tom had a quick glance at what I'm trying to do here first.
dgrowleyml> >
dgrowleyml> >
dgrowleyml>
dgrowleyml> I decided to hack away any change the code Tom added in b557226. I've
dgrowleyml> changed it so that create_unique_path() now simply just uses if
dgrowleyml> (rel->is_unique_join), instead off all the calls to
dgrowleyml> relation_has_unique_index_for() and query_is_distinct_for(). This vastly
dgrowleyml> simplifies that code. One small change is that Tom's checks for uniqueness
dgrowleyml> on semi joins included checks for volatile functions, this check didn't
dgrowleyml> exist in the original join removal code, so I've left it out. We'll never
dgrowleyml> match a expression with a volatile function to a unique index as indexes
dgrowleyml> don't allow volatile function expressions anyway. So as I understand it
dgrowleyml> this only serves as a fast path out if the join condition has a volatile
dgrowleyml> function... But I'd assume that check is not all that cheap.
dgrowleyml>
dgrowleyml> I ended up making query_supports_distinctness() and query_is_distinct_for()
dgrowleyml> static in analyzejoins.c as they're not used in any other files.
dgrowleyml> relation_has_unique_index_for() is also now only used in analyzejoins.c,
dgrowleyml> but I've not moved it into that file yet as I don't want to bloat the
dgrowleyml> patch. I just added a comment to say it needs moved.
dgrowleyml>
dgrowleyml> I've also added a small amount of code to query_is_distinct_for() which
dgrowleyml> allows subqueries such as (select 1 a offset 0) to be marked as unique. I
dgrowleyml> thought it was a little silly that these were not being detected as unique,
dgrowleyml> so I fixed it. This has the side effect of allowing left join removals for
dgrowleyml> queries such as: select t1.* from t1 left join (select 1 a offset 0) a on
dgrowleyml> t1.id=a.a;
dgrowleyml>
dgrowleyml> Updated patch attached.
dgrowleyml>
dgrowleyml> Regards
dgrowleyml>
dgrowleyml> David Rowley
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello, The attached is non-search version of unique join.
It is not fully examined but looks to work as expected. Many
small changes make the patch larger but affected area is rather
small.
What do you think about this?
Hello, I don't have enough time for now but made some
considerations on this.It might be a way that marking unique join peer at bottom and
propagate up, not searching from top of join list.
Around create_join_clause might be a candidate for it.
I'll investigate that later.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center