Mistake in documentation for CREATE STATISTICS
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/11/sql-createstatistics.html
Description:
The example on https://www.postgresql.org/docs/11/sql-createstatistics.html
seems wrong: instead of "EXPLAIN ANALYZE SELECT ..." it should say "EXPLAIN
SELECT ...". If changed that way, the query optimizer estimates the number
of rows at 1 and then 100 after statistics are created which is indeed how
many the query returns. In the current wording, with ANALYZE, the query
optimizer estimates the number of rows at 100 for both the first and second
select.
Tested on PostgreSQL 11.
PG Doc comments form <noreply@postgresql.org> writes:
The example on https://www.postgresql.org/docs/11/sql-createstatistics.html
seems wrong: instead of "EXPLAIN ANALYZE SELECT ..." it should say "EXPLAIN
SELECT ...". If changed that way, the query optimizer estimates the number
of rows at 1 and then 100 after statistics are created which is indeed how
many the query returns. In the current wording, with ANALYZE, the query
optimizer estimates the number of rows at 100 for both the first and second
select.
I think you're mistaking actual rows for estimated rows in the output
of EXPLAIN ANALYZE.
The point of using EXPLAIN ANALYZE here is just to make it easier to see
that the estimate is indeed wrong, by providing the actual count alongside
the estimate. I don't think that taking out ANALYZE would be an
improvement. For instance, in the first EXPLAIN I get
Gather (cost=1000.00..11675.10 rows=1 width=8) (actual time=1.215..45.218 rows
=100 loops=1)
which without ANALYZE would just be
Gather (cost=1000.00..11675.10 rows=1 width=8)
The rows=1 estimate is equally wrong either way, but you don't get to
see the correct value without ANALYZE.
regards, tom lane