Explain output: wrong row count?

Started by Alvaro Herreraabout 21 years ago2 messages
#1Alvaro Herrera
alvherre@dcc.uchile.cl

I have this EXPLAIN ANALYZE output:

alvherre=# explain analyze select * from oficina join ciudad using
(codciudad) where codtipoofi in (4,5);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Hash Join (cost=15.24..68.76 rows=621 width=394) (actual time=6.003..22.175 rows=641 loops=1)
Hash Cond: ("outer".codciudad = "inner".codciudad)
-> Seq Scan on oficina (cost=0.00..42.66 rows=621 width=309) (actual time=0.013..3.103 rows=641 loops=1)
Filter: ((codtipoofi = 4) OR (codtipoofi = 5))
-> Hash (cost=13.99..13.99 rows=499 width=89) (actual time=5.947..5.947 rows=0 loops=1)
-> Seq Scan on ciudad (cost=0.00..13.99 rows=499 width=89) (actual time=0.018..3.909 rows=499 loops=1)
Total runtime: 24.448 ms
(7 rows)

Note that the Hash step has an estimated row count of 499 (which is a
good estimate), but the actual row count is 0, which is way off. But,
the outer Hash Join step has a good estimate.

I wonder if the rows=0 is a bug, or is that number meant to be
interpreted in some special way? It is the same on both 7.4.6 and
8.0beta5. (Now that I look, it's 0 in all Hash steps I have at sight
... I had never noticed before!)

This query doesn't actually affect me a lot, but it is part of a bigger
query whose estimation is way off. I won't post it here because it's
topic for pgsql-performance ...

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"No reniegues de lo que alguna vez cre�ste"

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#1)
Re: Explain output: wrong row count?

Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

I have this EXPLAIN ANALYZE output:
...
-> Hash (cost=13.99..13.99 rows=499 width=89) (actual time=5.947..5.947 rows=0 loops=1)
-> Seq Scan on ciudad (cost=0.00..13.99 rows=499 width=89) (actual time=0.018..3.909 rows=499 loops=1)

Note that the Hash step has an estimated row count of 499 (which is a
good estimate), but the actual row count is 0, which is way off.

The "actual rows" count on a Hash node is always 0, because we don't
retrieve rows from it in the normal way (ie via ExecProcNode). The
actual row count from the child plan node (here, the 499 for the
seqscan) tells you how many rows got loaded into the hashtable.

I've occasionally thought about putting in some kind of kluge to set the
"actual rows" count to the hash table size, just because it looks funny
to have zero there. But it hasn't got to the top of the to-do list ...

regards, tom lane