Nested Loop Left Join always shows rows=1

Started by Leonardo Francalanciover 17 years ago5 messageshackersgeneral

Jump to latest

Leonardo Francalanci

m_lists@yahoo.it

over 17 years ago

hackersgeneral

Query is:

"explain analyze select sum(A), count(*) from tab1 left outer join
tab_outer on id=out_id where id=10;"

output:

Aggregate (cost=31.91..31.92 rows=1 width=4) (actual
time=14.185..14.185 rows=1 loops=1)
-> Nested Loop Left Join (cost=0.00..17.11 rows=2959 width=4)
(actual time=8.608..13.400 rows=2953 loops=1)
-> Index Scan using id_idx on tab1 (cost=0.00..8.27 rows=1
width=4) (actual time=0.010..0.011 rows=1 loops=1)
Index Cond: (id = 10)
-> Index Scan using out_id_idx on tab_outer (cost=0.00..8.83
rows=1 width=8) (actual time=8.590..11.924 rows=2953 loops=1)
Index Cond: ((tab_outer.out_id = 10) AND (tab1.id =
tab_outer.out_id))

http://explain-analyze.info/query_plans/2519-query-plan-1240

I ran ANALYZE on both tables, but the plan shows rows=1 in the inner
table results.

If I change the "left outer join" into an "inner join" the bitmap index
scan of the inner table of the nested loop shows a correct number of
rows (not the exact same thing as "actual rows", but very close).

Am I reading the explain output in the wrong way?
I'm not complaining about performance issues, I'm just trying to learn
how to read plans.

Tom Lane

tgl@sss.pgh.pa.us

over 17 years ago

In reply to: Leonardo Francalanci (#1)

hackersgeneral

Re: Nested Loop Left Join always shows rows=1

Scara Maccai <m_lists@yahoo.it> writes:

Aggregate (cost=31.91..31.92 rows=1 width=4) (actual
time=14.185..14.185 rows=1 loops=1)
-> Nested Loop Left Join (cost=0.00..17.11 rows=2959 width=4)
(actual time=8.608..13.400 rows=2953 loops=1)
-> Index Scan using id_idx on tab1 (cost=0.00..8.27 rows=1
width=4) (actual time=0.010..0.011 rows=1 loops=1)
Index Cond: (id = 10)
-> Index Scan using out_id_idx on tab_outer (cost=0.00..8.83
rows=1 width=8) (actual time=8.590..11.924 rows=2953 loops=1)
Index Cond: ((tab_outer.out_id = 10) AND (tab1.id =
tab_outer.out_id))

This seems a bit broken :-( ... ideally it shouldn't be generating the
redundant index condition, either. I think the redundant condition is
probably indirectly responsible for the low rowcount estimate, too.
It doesn't really hurt anything, since the join output estimate is
correct, but it'd be nice to make it look better.

regards, tom lane

Leonardo Francalanci

m_lists@yahoo.it

over 17 years ago

In reply to: Tom Lane (#2)

hackersgeneral

Re: Nested Loop Left Join always shows rows=1

Tom Lane wrote:

Scara Maccai <m_lists@yahoo.it> writes:

-> Index Scan using id_idx on tab1 (cost=0.00..8.27 rows=1
width=4) (actual time=0.010..0.011 rows=1 loops=1)
Index Cond: (id = 10)
-> Index Scan using out_id_idx on tab_outer (cost=0.00..8.83
rows=1 width=8) (actual time=8.590..11.924 rows=2953 loops=1)
Index Cond: ((tab_outer.out_id = 10) AND (tab1.id =
tab_outer.out_id))

This seems a bit broken :-( ... ideally it shouldn't be generating the
redundant index condition, either.

Why is the index condition redundant? I guess the redundant part is

"AND (tab1.id = tab_outer.out_id)" but I'm not sure, and I would really appreciate if you could explain a little bit more...

The reason I'm asking is because I'm looking at the implementation of some VERY basic form of progress indicator based on "estimated rows vs rows already processed".
I guess this would work for our DB because the planner is usually very good at guessing the number of rows for each step of the plan, since our distributions are pretty simple.

But this left outer join thing I'm afraid would invalidate the method, since the inner table row count is always off...

Thank you for the reply.

(I added hackers too, since this seems a place where some
fixing/development would be necessary. I hope it's ok.)

Tom Lane

tgl@sss.pgh.pa.us

over 17 years ago

In reply to: Leonardo Francalanci (#1)

hackersgeneral

Re: Nested Loop Left Join always shows rows=1

Scara Maccai <m_lists@yahoo.it> writes:

I ran ANALYZE on both tables, but the plan shows rows=1 in the inner
table results.

If I change the "left outer join" into an "inner join" the bitmap index
scan of the inner table of the nested loop shows a correct number of
rows (not the exact same thing as "actual rows", but very close).

Yeah, this is a bug: it's effectively double-counting the selectivity of
the index clause. The error isn't enormously critical, since the join
size estimate is right; but it could perhaps lead to choosing a plain
indexscan when a bitmap scan would be better. I've applied a patch.

regards, tom lane

Leonardo Francalanci

m_lists@yahoo.it

over 17 years ago

In reply to: Tom Lane (#4)

general

Re: Nested Loop Left Join always shows rows=1

Yeah, this is a bug: it's effectively double-counting the
selectivity of the index clause. The error isn't enormously critical,
since the join size estimate is right; but it could perhaps lead to
choosing a plain indexscan when a bitmap scan would be better. I've
applied a patch.

Thank you. I'll try that.

I was able to change the PgBackendStatus struct to hold a "percentage of completion" field, which shows up when calling the pg_stat_get_activity function.

As I said in a previous mail, the progress indicator gives very good estimates for the simple queries I need. If I can come up with something that is good in general I'll post it. I'm reading some papers about the argument (mostly the ones listed in http://wiki.postgresql.org/wiki/Query_progress_indication).

Import Notes

Resolved by subject fallback