judging acceptable discrepancy in row count v. estimate

Started by Rob Sargent over 7 years ago, 4 messages, general
#1Rob Sargent
robjsargent@gmail.com

Should reality be half again as large as the estimated row count?

coon=# select count(*) from sui.segment;
count
----------
49,942,837 -- my commas
(1 row)

coon=# vacuum (analyse, verbose) sui.probandset;
INFO: vacuuming "sui.probandset"
INFO: scanned index "probandset_pkey" to remove 3122 row versions
DETAIL: CPU: user: 4.70 s, system: 1.45 s, elapsed: 26.97 s
INFO: scanned index "probandsetunique" to remove 3122 row versions
DETAIL: CPU: user: 5.99 s, system: 10.24 s, elapsed: 97.42 s
INFO: "probandset": removed 3122 row versions in 1951 pages
DETAIL: CPU: user: 0.04 s, system: 0.00 s, elapsed: 0.73 s
INFO: index "probandset_pkey" now contains 33655227 row versions in 259175 pages
DETAIL: 3122 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.05 s.
INFO: index "probandsetunique" now contains 33655227 row versions in 1231894 pages
DETAIL: 3121 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.05 s.
INFO: "probandset": found 890 removable, 90624 nonremovable row versions in 4039 out of 2244646 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 288037
There were 5463 unused item pointers.
Skipped 0 pages due to buffer pins, 564917 frozen pages.
0 pages are entirely empty.
CPU: user: 10.91 s, system: 11.80 s, elapsed: 131.01 s.
INFO: vacuuming "pg_toast.pg_toast_18165"
INFO: index "pg_toast_18165_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.01 s.
INFO: "pg_toast_18165": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 288037
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.03 s.
INFO: analyzing "sui.probandset"
INFO: "probandset": scanned 30000 of 2244646 pages, containing 448480 live rows and 0 dead rows; 30000 rows in sample, 33555961 estimated total rows
VACUUM
Time: 535436.137 ms (08:55.436)
coon=# reindex table sui.segment
coon-# ;
REINDEX
Time: 681530.451 ms (11:21.530)
coon=# select count(*) from sui.segment;
count
----------
49942837
(1 row)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rob Sargent (#1)
Re: judging acceptable discrepancy in row count v. estimate

Rob Sargent <robjsargent@gmail.com> writes:

Should reality be half again as large as the estimated row count?
coon=# select count(*) from sui.segment;
count
----------
49,942,837 -- my commas
(1 row)

coon=# vacuum (analyse, verbose) sui.probandset;

Uh, what does sui.probandset have to do with sui.segment ?

regards, tom lane

#3Rob Sargent
robjsargent@gmail.com
In reply to: Tom Lane (#2)
Re: judging acceptable discrepancy in row count v. estimate

On Oct 16, 2018, at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Rob Sargent <robjsargent@gmail.com> writes:

Should reality be half again as large as the estimated row count?
coon=# select count(*) from sui.segment;
count
----------
49,942,837 -- my commas
(1 row)

coon=# vacuum (analyse, verbose) sui.probandset;

Uh, what does sui.probandset have to do with sui.segment ?

regards, tom lane

As the locals say, Oh My Heck. Nothing at all as far as row count is concerned. Deepest apologies.

#4Rob Sargent
robjsargent@gmail.com
In reply to: Tom Lane (#2)
Re: judging acceptable discrepancy in row count v. estimate

In fullness,

INFO: analyzing "sui.segment"
INFO: "segment": scanned 30000 of 1019242 pages, containing 1470000 live rows and 0 dead rows; 30000 rows in sample, 49942858 estimated total rows
VACUUM
Time: 321934.748 ms (05:21.935)

So, rather accurately estimated (no inserts or deletes since the bogus report). Looks like it's good to 5+ decimal places, given sufficient records.

select 49942858.0/49942837.0;
?column?
------------------------
1.00000042048071878656
(1 row)
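
For anyone repeating this check, the estimate that ANALYZE stores can also be read back from the catalog rather than from the verbose log. A minimal sketch, assuming the table is sui.segment as above; the column aliases are only illustrative. Note that pg_class.reltuples is a float4, so the stored value carries roughly six significant digits and a ratio computed this way is coarser than the one shown above.

```sql
-- Compare the planner's stored row estimate with an exact count.
-- reltuples is refreshed by VACUUM and ANALYZE; it is an estimate, not a live counter.
select c.reltuples::bigint as estimated_rows,
       s.actual_rows,
       round(c.reltuples::numeric / nullif(s.actual_rows, 0), 8) as ratio
from pg_class c
cross join (select count(*) as actual_rows from sui.segment) s
where c.oid = 'sui.segment'::regclass;
```

The count(*) still scans the whole table, so this is a spot check, not something to run routinely on a 50-million-row table.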

This table has no variable length columns. I imagine that helps.
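
The fixed row width plausibly matters because the estimates in the logs are consistent with a plain extrapolation: live rows seen in the sampled pages, divided by pages scanned, times total pages. A quick check using only the figures from the two ANALYZE lines in this thread (this is arithmetic on the logged numbers, not a claim about how the server computes the estimate internally):

```sql
-- sui.segment: 1470000 live rows in 30000 of 1019242 pages (exactly 49 rows per page)
select round(1470000.0 / 30000 * 1019242) as segment_estimate;      -- 49942858

-- sui.probandset: 448480 live rows in 30000 of 2244646 pages
select round(448480.0 / 30000 * 2244646) as probandset_estimate;    -- 33555961
```

An average of exactly 49 rows per sampled page suggests the pages of segment are packed uniformly, in which case a 30000-page sample would represent the rest of the table well. That would explain an estimate only 21 rows away from the true count.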