Should heapam_estimate_rel_size consider fillfactor?
Hi,
While testing some stuff, I noticed heapam_estimate_rel_size (or rather
table_block_relation_estimate_size, which it calls) ignores fillfactor, so
that for a table without statistics it ends up with a reltuples estimate
much higher than reality. For example, with fillfactor=10 the estimate is
about 10x higher.
I ran into this while doing some tests with hash indexes, where I use
fillfactor to make the table look bigger (as if the tuples were wider),
and I ran into this:
drop table hash_test;
create table hash_test (a int, b text) with (fillfactor=10);
insert into hash_test select 1 + ((i - 1) / 10000), md5(i::text)
from generate_series(1, 1000000) s(i);
-- analyze hash_test;
create index hash_test_idx on hash_test using hash (a);
select pg_size_pretty(pg_relation_size('hash_test_idx'));
If you run it like this (without the analyze), the index will be 339MB.
With the analyze, it's 47MB.
This only happens if there are no relpages/reltuples statistics yet, in
which case table_block_relation_estimate_size estimates density from
tuple width etc.
So it seems the easiest "fix" is to do ANALYZE before creating the index
(and not after it, as I had in my scripts). But I wonder how many people
fail to realize this - it sure took me a while to notice the indexes were
too large, and even longer to figure out what was causing it. I wouldn't be
very surprised if many people had bloated hash indexes after bulk loads.
So maybe we should make table_block_relation_estimate_size smarter to
also consider the fillfactor in the "no statistics" branch, per the
attached patch.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
relsize-fillfactor-fix.patch (text/x-patch, +9/-1)
So maybe we should make table_block_relation_estimate_size smarter to
also consider the fillfactor in the "no statistics" branch, per the
attached patch.
I like this a lot. The reasoning is obvious, the fix is simple, it doesn't
upset any make-check-world tests, and in order to get a performance
regression we'd need a table whose fillfactor has been changed after the
data was loaded but before an analyze happens, and that's a narrow enough
case to accept.
My only nitpick is to swap
(usable_bytes_per_page * fillfactor / 100) / tuple_width
with
(usable_bytes_per_page * fillfactor) / (tuple_width * 100)
as this will eliminate the extra remainder truncation, and it also gets the
arguments "in order" algebraically.
Hi,
On 2023-06-11 14:41:27 +0200, Tomas Vondra wrote:
While testing some stuff, I noticed heapam_estimate_rel_size (or rather
table_block_relation_estimate_size, which it calls) ignores fillfactor, so
that for a table without statistics it ends up with a reltuples estimate
much higher than reality. For example, with fillfactor=10 the estimate is
about 10x higher.
I ran into this while doing some tests with hash indexes, where I use
fillfactor to make the table look bigger (as if the tuples were wider),
and I ran into this:
drop table hash_test;
create table hash_test (a int, b text) with (fillfactor=10);
insert into hash_test select 1 + ((i - 1) / 10000), md5(i::text)
from generate_series(1, 1000000) s(i);
-- analyze hash_test;
create index hash_test_idx on hash_test using hash (a);
select pg_size_pretty(pg_relation_size('hash_test_idx'));
If you run it like this (without the analyze), the index will be 339MB.
With the analyze, it's 47MB.
This only happens if there are no relpages/reltuples statistics yet, in
which case table_block_relation_estimate_size estimates density from
tuple width etc.
So it seems the easiest "fix" is to do ANALYZE before creating the index
(and not after it, as I had in my scripts). But I wonder how many people
fail to realize this - it sure took me a while to notice the indexes were
too large, and even longer to figure out what was causing it. I wouldn't be
very surprised if many people had bloated hash indexes after bulk loads.
So maybe we should make table_block_relation_estimate_size smarter to
also consider the fillfactor in the "no statistics" branch, per the
attached patch.
Seems like a good idea - I can't think of a reason why we shouldn't do so.
Greetings,
Andres Freund
On 14.06.23 20:41, Corey Huinker wrote:
So maybe we should make table_block_relation_estimate_size smarter to
also consider the fillfactor in the "no statistics" branch, per the
attached patch.
I like this a lot. The reasoning is obvious, the fix is simple, it
doesn't upset any make-check-world tests, and in order to get a
performance regression we'd need a table whose fillfactor has been
changed after the data was loaded but before an analyze happens, and
that's a narrow enough case to accept.
My only nitpick is to swap
(usable_bytes_per_page * fillfactor / 100) / tuple_width
with
(usable_bytes_per_page * fillfactor) / (tuple_width * 100)
as this will eliminate the extra remainder truncation, and it also gets
the arguments "in order" algebraically.
The fillfactor is in percent, so it makes sense to divide it by 100
first before doing any further arithmetic with it. I find your version
of this to be more puzzling without any explanation. You could do
fillfactor/100.0 to avoid the integer division, but then, the comment
above that says "integer division is intentional here", without any
further explanation. I think a bit more explanation of all the
subtleties here would be good in any case.
On 7/3/23 08:46, Peter Eisentraut wrote:
On 14.06.23 20:41, Corey Huinker wrote:
So maybe we should make table_block_relation_estimate_size smarter to
also consider the fillfactor in the "no statistics" branch, per the
attached patch.
I like this a lot. The reasoning is obvious, the fix is simple, it
doesn't upset any make-check-world tests, and in order to get a
performance regression we'd need a table whose fillfactor has been
changed after the data was loaded but before an analyze happens, and
that's a narrow enough case to accept.
My only nitpick is to swap
(usable_bytes_per_page * fillfactor / 100) / tuple_width
with
(usable_bytes_per_page * fillfactor) / (tuple_width * 100)
as this will eliminate the extra remainder truncation, and it also
gets the arguments "in order" algebraically.
The fillfactor is in percent, so it makes sense to divide it by 100
first before doing any further arithmetic with it. I find your version
of this to be more puzzling without any explanation. You could do
fillfactor/100.0 to avoid the integer division, but then, the comment
above that says "integer division is intentional here", without any
further explanation. I think a bit more explanation of all the
subtleties here would be good in any case.
Yeah, I guess the formula should be doing (fillfactor / 100.0).
The "integer division is intentional here" comment is unrelated to this
change, it refers to the division by "tuple_width" (and yeah, it'd be
nice to have it explain why it's intentional).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 7/3/23 09:00, Tomas Vondra wrote:
On 7/3/23 08:46, Peter Eisentraut wrote:
On 14.06.23 20:41, Corey Huinker wrote:
So maybe we should make table_block_relation_estimate_size smarter to
also consider the fillfactor in the "no statistics" branch, per the
attached patch.
I like this a lot. The reasoning is obvious, the fix is simple, it
doesn't upset any make-check-world tests, and in order to get a
performance regression we'd need a table whose fillfactor has been
changed after the data was loaded but before an analyze happens, and
that's a narrow enough case to accept.
My only nitpick is to swap
(usable_bytes_per_page * fillfactor / 100) / tuple_width
with
(usable_bytes_per_page * fillfactor) / (tuple_width * 100)
as this will eliminate the extra remainder truncation, and it also
gets the arguments "in order" algebraically.
The fillfactor is in percent, so it makes sense to divide it by 100
first before doing any further arithmetic with it. I find your version
of this to be more puzzling without any explanation. You could do
fillfactor/100.0 to avoid the integer division, but then, the comment
above that says "integer division is intentional here", without any
further explanation. I think a bit more explanation of all the
subtleties here would be good in any case.
Yeah, I guess the formula should be doing (fillfactor / 100.0).
The "integer division is intentional here" comment is unrelated to this
change, it refers to the division by "tuple_width" (and yeah, it'd be
nice to have it explain why it's intentional).
FWIW the reason why the integer division is intentional is most likely
that we want "floor" semantics - if there's 10.23 rows per page, that
really means 10 rows per page.
I doubt it makes a huge difference in this particular place, considering
we're calculating the estimate from somewhat unreliable values, and then
use it for rough estimate of relation size.
But from this POV, I think it's more correct to do it "my" way:
density = (usable_bytes_per_page * fillfactor / 100) / tuple_width;
because that's doing *two* separate integer divisions, with floor
semantics. First we calculate "usable bytes" (rounded down), then
average number of rows per page (also rounded down).
Corey's formula would do just one integer division. I don't think it
makes a huge difference, though. I mean, it's just an estimate and so we
can't expect to be 100% accurate.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 7/3/23 11:40, Tomas Vondra wrote:
...
FWIW the reason why the integer division is intentional is most likely
that we want "floor" semantics - if there's 10.23 rows per page, that
really means 10 rows per page.
I doubt it makes a huge difference in this particular place, considering
we're calculating the estimate from somewhat unreliable values, and then
use it for rough estimate of relation size.
But from this POV, I think it's more correct to do it "my" way:
density = (usable_bytes_per_page * fillfactor / 100) / tuple_width;
because that's doing *two* separate integer divisions, with floor
semantics. First we calculate "usable bytes" (rounded down), then
average number of rows per page (also rounded down).
Corey's formula would do just one integer division. I don't think it
makes a huge difference, though. I mean, it's just an estimate and so we
can't expect to be 100% accurate.
Pushed, using the formula with two divisions (as in the original patch).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 03/07/2023 20:54, Tomas Vondra wrote:
Pushed, using the formula with two divisions (as in the original patch).
I ran into an issue with this, in the case of a small fillfactor and
wide tuple width:
On v16:
postgres=# create table t (data char(900)) with (fillfactor = 10,
autovacuum_enabled=off);
CREATE TABLE
postgres=# insert into t select g from generate_series(1, 1000) g;
INSERT 0 1000
postgres=# explain select count(*) from t;
QUERY PLAN
-------------------------------------------------------------
Aggregate (cost=1025.00..1025.01 rows=1 width=8)
-> Seq Scan on t (cost=0.00..1020.00 rows=2000 width=0)
(2 rows)
On v17:
QUERY PLAN
----------------------------------------------------------
Aggregate (cost=1000.00..1000.01 rows=1 width=8)
-> Seq Scan on t (cost=0.00..1000.00 rows=1 width=0)
(2 rows)
The new estimate is 1 row, which is bad. Didn't change the plan in this
case, but I originally saw this in a test with more rows, and the
planner would not choose a parallel scan for the query because of that.
The calculation in table_block_relation_estimate_size() in this case is:
tuple_width=3604
overhead_bytes_per_tuple=28
fillfactor=10
usable_bytes_per_page=8168
density = (usable_bytes_per_page * fillfactor / 100) / tuple_width
which gets rounded down to 0.
The straightforward fix is to clamp it to 1. The executor will always
place at least one tuple on a page, regardless of fillfactor.
--
Heikki Linnakangas
Neon (https://neon.tech)
On 1/29/25 19:56, Heikki Linnakangas wrote:
On 03/07/2023 20:54, Tomas Vondra wrote:
Pushed, using the formula with two divisions (as in the original patch).
I ran into an issue with this, in the case of a small fillfactor and
wide tuple width:
On v16:
postgres=# create table t (data char(900)) with (fillfactor = 10,
autovacuum_enabled=off);
CREATE TABLE
postgres=# insert into t select g from generate_series(1, 1000) g;
INSERT 0 1000
postgres=# explain select count(*) from t;
QUERY PLAN
-------------------------------------------------------------
Aggregate (cost=1025.00..1025.01 rows=1 width=8)
-> Seq Scan on t (cost=0.00..1020.00 rows=2000 width=0)
(2 rows)
On v17:
QUERY PLAN
----------------------------------------------------------
Aggregate (cost=1000.00..1000.01 rows=1 width=8)
-> Seq Scan on t (cost=0.00..1000.00 rows=1 width=0)
(2 rows)
The new estimate is 1 row, which is bad. Didn't change the plan in this
case, but I originally saw this in a test with more rows, and the
planner would not choose a parallel scan for the query because of that.
The calculation in table_block_relation_estimate_size() in this case is:
tuple_width=3604
overhead_bytes_per_tuple=28
fillfactor=10
usable_bytes_per_page=8168
density = (usable_bytes_per_page * fillfactor / 100) / tuple_width
which gets rounded down to 0.
The straightforward fix is to clamp it to 1. The executor will always
place at least one tuple on a page, regardless of fillfactor.
Thanks for the report. And yeah, clamping it to 1 seems like the right
fix for this. I wonder if it's worth inventing some sort of test for
this; it shouldn't be too hard, I guess.
In any case, I'll take care of the fix/backpatch soon.
regards
--
Tomas Vondra
On 2/4/25 16:02, Tomas Vondra wrote:
...
Thanks for the report. And yeah, clamping it to 1 seems like the right
fix for this. I wonder if it's worth inventing some sort of test for
this, shouldn't be too hard I guess.
In any case, I'll take care of the fix/backpatch soon.
Here's a proposed fix, including a simple test in the stats suite.
regards
--
Tomas Vondra
Attachments:
fillfactor-fix.patch (text/x-patch, +76/-0)
On 2/4/25 17:54, Tomas Vondra wrote:
On 2/4/25 16:02, Tomas Vondra wrote:
...
Thanks for the report. And yeah, clamping it to 1 seems like the right
fix for this. I wonder if it's worth inventing some sort of test for
this, shouldn't be too hard I guess.
In any case, I'll take care of the fix/backpatch soon.
Here's a proposed fix, including a simple test in the stats suite.
Here's a slightly improved fix, with a proper commit message and
simplified test case. I plan to push this once the releases get out (I
know it was stamped already, but ...).
regards
--
Tomas Vondra
Attachments:
0001-Correct-relation-size-estimate-with-low-fillfactor.patch (text/x-patch, +36/-1)
On 18/02/2025 14:01, Tomas Vondra wrote:
On 2/4/25 17:54, Tomas Vondra wrote:
On 2/4/25 16:02, Tomas Vondra wrote:
...
Thanks for the report. And yeah, clamping it to 1 seems like the right
fix for this. I wonder if it's worth inventing some sort of test for
this, shouldn't be too hard I guess.
In any case, I'll take care of the fix/backpatch soon.
Here's a proposed fix, including a simple test in the stats suite.
Here's a slightly improved fix, with a proper commit message and
simplified test case. I plan to push this once the releases get out (I
know it was stamped already, but ...).
Looks good, thanks!
--
Heikki Linnakangas
Neon (https://neon.tech)
On 2/18/25 13:29, Heikki Linnakangas wrote:
On 18/02/2025 14:01, Tomas Vondra wrote:
On 2/4/25 17:54, Tomas Vondra wrote:
On 2/4/25 16:02, Tomas Vondra wrote:
...
Thanks for the report. And yeah, clamping it to 1 seems like the right
fix for this. I wonder if it's worth inventing some sort of test for
this, shouldn't be too hard I guess.
In any case, I'll take care of the fix/backpatch soon.
Here's a proposed fix, including a simple test in the stats suite.
Here's a slightly improved fix, with a proper commit message and
simplified test case. I plan to push this once the releases get out (I
know it was stamped already, but ...).
Looks good, thanks!
Pushed. Thanks!
--
Tomas Vondra