BUG #15717: Index not used when ordering by left joined table column

Started by PG Bug reporting formabout 7 years ago3 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 15717
Logged by: Alex
Email address: cdalxndr@yahoo.com
PostgreSQL version: 10.0
Operating system: Windows 10
Description:

The following query runs slow, as it seq scans tables, without using any
index:
select
wp.id as id
from
web_page wp
left join web_url_path url on wp.id=url.page
order by
url.priority asc
limit 1

Explain analyze:
"Limit (cost=180140.55..180140.55 rows=1 width=8) (actual
time=2402.310..2402.310 rows=1 loops=1)"
" -> Sort (cost=180140.55..185578.31 rows=2175106 width=8) (actual
time=2402.309..2402.309 rows=1 loops=1)"
" Sort Key: url.priority"
" Sort Method: top-N heapsort Memory: 25kB"
" -> Hash Right Join (cost=68981.25..169265.02 rows=2175106
width=8) (actual time=588.010..2232.158 rows=2159417 loops=1)"
" Hash Cond: (url.page = wp.id)"
" -> Seq Scan on web_url_path url (cost=0.00..47376.06
rows=2175106 width=8) (actual time=0.044..403.351 rows=2175106 loops=1)"
" -> Hash (cost=33480.89..33480.89 rows=2163789 width=4)
(actual time=581.473..581.473 rows=2158888 loops=1)"
" Buckets: 131072 Batches: 32 Memory Usage: 3403kB"
" -> Seq Scan on web_page wp (cost=0.00..33480.89
rows=2163789 width=4) (actual time=0.076..251.148 rows=2158888 loops=1)"
"Planning time: 0.295 ms"
"Execution time: 2402.956 ms"

Note that I have indexes (btree) on columns "web_url_path.priority" and
"web_url_path.page".
I was expecting the "priority" column index would be used to retrieve first
url then do a reverse join to get the corresponding page, and continue if no
page is found.

Note that the following query uses the "priority" index:
select u.id
from web_url_path u
order by u.priority
limit 1

Explain analyze:
"Limit (cost=0.43..0.50 rows=1 width=8) (actual time=0.014..0.014 rows=1
loops=1)"
" -> Index Scan using web_url_path_priority_idx on web_url_path u
(cost=0.43..144920.91 rows=2175106 width=8) (actual time=0.014..0.014 rows=1
loops=1)"
"Planning time: 0.114 ms"
"Execution time: 0.026 ms"

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: PG Bug reporting form (#1)
Re: BUG #15717: Index not used when ordering by left joined table column

PG Bug reporting form <noreply@postgresql.org> writes:

The following query runs slow, as it seq scans tables, without using any
index:
select
wp.id as id
from
web_page wp
left join web_url_path url on wp.id=url.page
order by
url.priority asc
limit 1

I was expecting the "priority" column index would be used to retrieve first
url then do a reverse join to get the corresponding page, and continue if no
page is found.

It would probably do that if you used a plain join, or a right join.
But a nestloop plan cannot be driven from the inside of an outer join,
and that's the case this query presents.

regards, tom lane

#3Alex
cdalxndr@yahoo.com
In reply to: Tom Lane (#2)
Re: BUG #15717: Index not used when ordering by left joined table column

Indeed, with an inner join the index is used.Thanks for the assistance.

On Thursday, March 28, 2019, 3:19:30 AM GMT+2, Tom Lane <tgl@sss.pgh.pa.us> wrote:

PG Bug reporting form <noreply@postgresql.org> writes:

The following query runs slow, as it seq scans tables, without using any
index:
  select
      wp.id as id
  from
      web_page wp
      left join web_url_path url on wp.id=url.page
  order by
      url.priority asc
  limit 1

I was expecting the "priority" column index would be used to retrieve first
url then do a reverse join to get the corresponding page, and continue if no
page is found.

It would probably do that if you used a plain join, or a right join.
But a nestloop plan cannot be driven from the inside of an outer join,
and that's the case this query presents.

            regards, tom lane