[patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Please add the attached patch and this discussion to the open commit fest. The
original bugs thread is here: 20180111111254.1408.8342@wrigleys.postgresql.org.
Bug reference: 15005
Logged by: David Gould
Email address: daveg@sonic.net
PostgreSQL version: 10.1 and earlier
Operating system: Linux
Description:
ANALYZE can make pg_class.reltuples wildly inaccurate compared to the actual
row counts for tables that are larger than the default_statistics_target.
Example from one of a client's production instances:
# analyze verbose pg_attribute;
INFO: analyzing "pg_catalog.pg_attribute"
INFO: "pg_attribute": scanned 30000 of 24519424 pages, containing 6475 live rows and 83 dead rows; 6475 rows in sample, 800983035 estimated total rows.
This is a large, complex database - pg_attribute actually has about five
million rows and needs about one hundred thousand pages. However, it has
become extremely bloated and is taking 25 million pages (192GB!), about 250
times too much. This happened despite aggressive autovacuum settings and a
periodic bloat monitoring script. Since pg_class.reltuples was 800 million,
the bloat monitoring script did not detect that this table was bloated and
autovacuum did not think it needed vacuuming.
When reltuples is very large compared to the actual row count, it causes a
number of problems:
- Bad input to the query planner.
- Prevents autovacuum from processing large bloated tables, because
autovacuum_vacuum_scale_factor * reltuples is large enough that the threshold
is rarely reached (see the arithmetic below).
- Deceives bloat checking tools that rely on the relationship of relpages
to reltuples*average_row_size.
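To put rough numbers on the autovacuum point, assuming the default settings
(autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2; the
report above says autovacuum was tuned more aggressively, but the shape of the
problem is the same): autovacuum only vacuums a table once dead tuples exceed
roughly
    autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
    = 50 + 0.2 * 800,000,000 = ~160 million
which is far more than the ~5 million live rows pg_attribute actually holds,
so the table effectively never qualifies.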
I've tracked down how this happens and created a reproduction script and a
patch. Attached:
- analyze_reltuples_bug-v1.patch Patch against master
- README.txt Instructions for testing
- analyze_reltuples_bug.sql Reproduction script
- analyze_counts.awk Helper for viewing results of test
- test_standard.txt Test output for unpatched postgresql 10.1
- test_patched.txt Test output with patch
The patch applies cleanly, with some offsets, to 9.4.15, 9.5.10, 9.6.6 and 10.1.
Note that this is not the same as the reltuples calculation bug discussed in the
thread at 16db4468-edfa-830a-f921-39a50498e77e@2ndquadrant.com. That one is
mainly concerned with vacuum, this with analyze. The two bugs do amplify each
other though.
Analysis:
---------
Analyze and vacuum calculate the new value for pg_class.reltuples in
vacuum.c:vac_estimate_reltuples():
old_density = old_rel_tuples / old_rel_pages;
new_density = scanned_tuples / scanned_pages;
multiplier = (double) scanned_pages / (double) total_pages;
updated_density = old_density + (new_density - old_density) * multiplier;
return floor(updated_density * total_pages + 0.5);
The comments talk about the difference between VACUUM and ANALYZE and explain
that VACUUM probably only scanned changed pages, so the density of the scanned
pages is not representative of the rest of the unchanged table. Hence the new
overall density of the table should be adjusted proportionally to the scanned
pages vs. total pages, which makes sense. However, despite the comment noting
that ANALYZE and VACUUM are different, the code actually does the same
calculation for both.
The problem is that this dilutes the impact of ANALYZE on reltuples for large
tables (see the sketch below):
- For a table of 3000000 pages an analyze can only change the reltuples
value by about 1%.
- When combined with changes in relpages due to bloat, the new computed
reltuples can end up far from reality.
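A minimal standalone C sketch of that first point (assuming a 3,000,000-page
table, a 30,000-page ANALYZE sample, and a stale estimate of 100 tuples/page
when the real density has dropped to 10; an illustration, not PostgreSQL
code):

    #include <stdio.h>
    #include <math.h>

    int
    main(void)
    {
        double total_pages   = 3000000.0;   /* table size in pages */
        double scanned_pages = 30000.0;     /* ANALYZE sample: 300 * target 100 */
        double reltuples     = 300000000.0; /* stale estimate: 100 tuples/page */
        double true_density  = 10.0;        /* the table is now much sparser */
        int    i;

        for (i = 1; i <= 10; i++)
        {
            /* same moving-average update as vac_estimate_reltuples() */
            double old_density     = reltuples / total_pages;
            double new_density     = true_density;  /* assume a perfect sample */
            double multiplier      = scanned_pages / total_pages;  /* 1% */
            double updated_density = old_density +
                                     (new_density - old_density) * multiplier;

            reltuples = floor(updated_density * total_pages + 0.5);
            printf("after ANALYZE %2d: reltuples = %.0f (actual %.0f)\n",
                   i, reltuples, true_density * total_pages);
        }
        return 0;
    }

Even though every sample is exactly right, reltuples creeps from 300 million
toward the true 30 million by only about 1% of the remaining gap per run,
reaching roughly 274 million after ten runs.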
Reproducing the reltuples analyze estimate bug.
-----------------------------------------------
The script "reltuples_analyze_bug.sql" creates a table that is large
compared to the analyze sample size and then repeatedly updates about
10% of it followed by an analyze each iteration. The bug is that the
calculation analyze uses to update pg_class.reltuples will tend to
increase each time even though the actual rowcount does not change.
To run:
Given a postgresql 10.x server with >= 1GB of shared buffers:
createdb test
psql --no-psqlrc -f analyze_reltuples_bug.sql test > test_standard.out 2>&1
awk -f analyze_counts.awk test_standard.out
To verify the fix, restart postgres with a patched binary and repeat
the above.
Here are the results with an unpatched server:
After 10 iterations of:
    update 10% of rows;
    analyze;
reltuples has almost doubled.
                   / estimated rows /    /    pages    /   / sampled rows /
relname              current   proposed     total  scanned     live     dead
reltuples_test      10000001   10000055    153847     3000   195000        0
reltuples_test      10981367    9951346    169231     3000   176410    18590
reltuples_test      11948112   10039979    184615     3000   163150    31850
reltuples_test      12900718   10070666    200000     3000   151060    43940
reltuples_test      13835185    9739305    215384     3000   135655    59345
reltuples_test      14758916    9864947    230768     3000   128245    66755
reltuples_test      15674572   10138631    246153     3000   123565    71435
reltuples_test      16576847    9910944    261537     3000   113685    81315
reltuples_test      17470388   10019961    276922     3000   108550    86450
reltuples_test      18356707   10234607    292306     3000   105040    89960
reltuples_test      19228409    9639927    307690     3000    93990   101010
-dg
--
David Gould daveg@sonic.net
If simplicity worked, the world would be overrun with insects.
Attachments:
analyze_reltuples_bug-v1.patch (text/x-patch)
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index cbd6e9b161..ebf03de45f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -766,10 +766,12 @@ vacuum_set_xid_limits(Relation rel,
*
* If we scanned the whole relation then we should just use the count of
* live tuples seen; but if we did not, we should not trust the count
- * unreservedly, especially not in VACUUM, which may have scanned a quite
- * nonrandom subset of the table. When we have only partial information,
- * we take the old value of pg_class.reltuples as a measurement of the
- * tuple density in the unscanned pages.
+ * unreservedly, since we have only partial information. VACUUM in particular
+ * may have scanned a quite nonrandom subset of the table, so we take
+ * the old value of pg_class.reltuples as a measurement of the tuple
+ * density in the unscanned pages. However, ANALYZE promises that we
+ * scanned a representative random sample of the table so we should use
+ * the new density directly.
*
* This routine is shared by VACUUM and ANALYZE.
*/
@@ -791,45 +793,39 @@ vac_estimate_reltuples(Relation relation, bool is_analyze,
return scanned_tuples;
/*
- * If scanned_pages is zero but total_pages isn't, keep the existing value
- * of reltuples. (Note: callers should avoid updating the pg_class
- * statistics in this situation, since no new information has been
- * provided.)
+ * If scanned_pages is zero, keep the existing value of reltuples.
+ * (Note: callers should avoid updating the pg_class statistics in
+ * this situation, since no new information has been provided.)
*/
if (scanned_pages == 0)
return old_rel_tuples;
/*
+ * For ANALYZE, the newly observed density in the pages scanned is
+ * based on a representative sample of the whole table and can be
+ * used as-is.
+ */
+ new_density = scanned_tuples / scanned_pages;
+ if (is_analyze)
+ return floor(new_density * total_pages + 0.5);
+
+ /*
* If old value of relpages is zero, old density is indeterminate; we
- * can't do much except scale up scanned_tuples to match total_pages.
+ * can't do much except use the new_density to scale up scanned_tuples
+ * to match total_pages.
*/
if (old_rel_pages == 0)
- return floor((scanned_tuples / scanned_pages) * total_pages + 0.5);
+ return floor(new_density * total_pages + 0.5);
/*
- * Okay, we've covered the corner cases. The normal calculation is to
- * convert the old measurement to a density (tuples per page), then update
- * the density using an exponential-moving-average approach, and finally
- * compute reltuples as updated_density * total_pages.
- *
- * For ANALYZE, the moving average multiplier is just the fraction of the
- * table's pages we scanned. This is equivalent to assuming that the
- * tuple density in the unscanned pages didn't change. Of course, it
- * probably did, if the new density measurement is different. But over
- * repeated cycles, the value of reltuples will converge towards the
- * correct value, if repeated measurements show the same new density.
- *
- * For VACUUM, the situation is a bit different: we have looked at a
- * nonrandom sample of pages, but we know for certain that the pages we
- * didn't look at are precisely the ones that haven't changed lately.
- * Thus, there is a reasonable argument for doing exactly the same thing
- * as for the ANALYZE case, that is use the old density measurement as the
- * value for the unscanned pages.
- *
- * This logic could probably use further refinement.
+ * For VACUUM, the situation is different: we have looked at a nonrandom
+ * sample of pages, and we know that the pages we didn't look at are
+ * the ones that haven't changed lately. Thus, we use the old density
+ * measurement for the unscanned pages and combine it with the observed
+ * new density scaled by the ratio of scanned to unscanned pages.
*/
+
old_density = old_rel_tuples / old_rel_pages;
- new_density = scanned_tuples / scanned_pages;
multiplier = (double) scanned_pages / (double) total_pages;
updated_density = old_density + (new_density - old_density) * multiplier;
return floor(updated_density * total_pages + 0.5);
Hi David,
I was able to reproduce the problem using your script.
analyze_counts.awk is missing, though.
The idea of using the result of ANALYZE as-is, without additional
averaging, was discussed when vac_estimate_reltuples() was introduced
originally. Ultimately, it was decided not to do so. You can find the
discussion in this thread:
/messages/by-id/BANLkTinL6QuAm_Xf8teRZboG2Mdy3dR_vw@mail.gmail.com
The core problem here seems to be that this calculation of moving
average does not converge in your scenario. It can be shown that when
the number of live tuples is constant and the number of pages grows, the
estimated number of tuples will increase at each step. Do you think we
can use some other formula that would converge in this scenario, but
still filter the noise in ANALYZE results? I couldn't think of one yet.
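A minimal standalone simulation of this scenario (assuming a constant 10
million live tuples, a table that starts at 153847 pages and bloats by 15384
pages per step, and a 3000-page ANALYZE sample that measures the true density
exactly, to mirror the test setup; an illustration, not PostgreSQL code):

    #include <stdio.h>
    #include <math.h>

    int
    main(void)
    {
        double live_tuples  = 10000000.0;  /* stays constant */
        double pages        = 153847.0;    /* ~65 tuples per page initially */
        double reltuples    = live_tuples; /* start with a correct estimate */
        double sample_pages = 3000.0;
        int    i;

        for (i = 1; i <= 10; i++)
        {
            /* mirror vac_estimate_reltuples(): the old density is based on
             * the page count recorded at the previous step, but the result
             * is scaled by the new, larger page count */
            double old_density     = reltuples / pages;
            double new_pages       = pages + 15384.0;          /* bloat */
            double new_density     = live_tuples / new_pages;  /* perfect sample */
            double multiplier      = sample_pages / new_pages;
            double updated_density = old_density +
                                     (new_density - old_density) * multiplier;

            reltuples = floor(updated_density * new_pages + 0.5);
            pages = new_pages;
            printf("step %2d: pages = %.0f, reltuples = %.0f (actual %.0f)\n",
                   i, pages, reltuples, live_tuples);
        }
        return 0;
    }

Even with a perfect sample at every step, the estimate climbs each time,
because the barely changed old density gets multiplied by the new, larger page
count; the printed values land close to the "current" column of the unpatched
test output quoted earlier in the thread.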
--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Wed, 28 Feb 2018 15:55:19 +0300
Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> wrote:
Hi David,
I was able to reproduce the problem using your script.
analyze_counts.awk is missing, though.
Attached now I hope. I think I also added it to the commitfest page.
The idea of using the result of ANALYZE as-is, without additional
averaging, was discussed when vac_estimate_reltuples() was introduced
originally. Ultimately, it was decided not to do so. You can find the
discussion in this thread:
/messages/by-id/BANLkTinL6QuAm_Xf8teRZboG2Mdy3dR_vw@mail.gmail.com
Well, that was a long discussion. I'm not sure I would agree that there was a
firm conclusion on what to do about ANALYZE results. There was some
recognition that the case of ANALYZE is different from VACUUM, and that is
reflected in the original code comments too. However, the actual code ended up
being the same for both ANALYZE and VACUUM. This patch is about exactly that.
See messages:
/messages/by-id/BANLkTimVhdO_bKQagRsH0OLp7MxgJZDryg@mail.gmail.com
/messages/by-id/BANLkTimaDj950K-298JW09RrmG0eJ_C=qQ@mail.gmail.com
/messages/by-id/28116.1306609295@sss.pgh.pa.us
The core problem here seems to be that this calculation of moving
average does not converge in your scenario. It can be shown that when
the number of live tuples is constant and the number of pages grows, the
estimated number of tuples will increase at each step. Do you think we
can use some other formula that would converge in this scenario, but
still filter the noise in ANALYZE results? I couldn't think of one yet.
Besides the test data generated with the script, I have parsed the analyze
verbose output for several large production systems running complex
applications and have found that for tables larger than the statistics
sample size (300*default_statistics_target), the row count you can calculate
from (pages/sample_pages) * live_rows is pretty accurate, within a few
percent of the value from count(*).
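For example, plugging in the pg_attribute numbers from the original report:
6475 live rows in 30000 of 24519424 pages extrapolates to
6475 * (24519424 / 30000) = ~5.3 million rows, in line with the roughly five
million rows the table actually holds, rather than the 800 million that the
moving-average calculation produced.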
In theory the sample pages analyze uses should represent the whole table
fairly well. We rely on this to generate pg_statistic, and it is a key
input to the planner. Why should we trust it any less when it comes to
reltuples? If the analyze sampling does not work, the fix would be to improve
that, not to disregard it piecemeal.
My motivation is that I have seen large systems fighting mysterious run-away
bloat for years no matter how aggressively autovacuum is tuned. The fact that
an inflated reltuples can cause autovacuum to simply ignore tables forever
seems worth fixing.
-dg
--
David Gould daveg@sonic.net
If simplicity worked, the world would be overrun with insects.
Attachments:
On 01.03.2018 06:23, David Gould wrote:
In theory the sample pages analyze uses should represent the whole table
fairly well. We rely on this to generate pg_statistic, and it is a key
input to the planner. Why should we trust it any less when it comes to
reltuples? If the analyze sampling does not work, the fix would be to improve
that, not to disregard it piecemeal.
Well, that sounds reasonable. But the problem with the moving average
calculation remains. Suppose you run vacuum and not analyze. If the
updates are random enough, vacuum won't be able to reclaim all the
pages, so the number of pages will grow. Again, we'll have the same
thing where the number of pages grows, the real number of live tuples
stays constant, and the estimated reltuples grows after each vacuum run.
I did some more calculations on paper to try to understand this. If we
average reltuples directly, instead of averaging tuple density, it
converges like it should. The error with this density calculation seems
to be that we're effectively multiplying the old density by the new
number of pages. I'm not sure why we even work with tuple density. We
could just estimate the number of tuples based on analyze/vacuum, and
then apply moving average to it. The calculations would be shorter, too.
What do you think?
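To spell out the two alternatives, using the same multiplier
(scanned_pages / total_pages) as the current code:
    current:  reltuples = (old_density + (new_density - old_density) * multiplier)
                          * total_pages,  where old_density = old_rel_tuples / old_rel_pages
    averaging tuple counts instead:
              reltuples = old_rel_tuples * (1 - multiplier)
                          + (new_density * total_pages) * multiplier
When total_pages has grown beyond old_rel_pages, the first form scales the old
contribution up by total_pages / old_rel_pages, while the second carries
old_rel_tuples through unchanged, which is what lets it converge in this
scenario.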
--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> writes:
On 01.03.2018 06:23, David Gould wrote:
In theory the sample pages analyze uses should represent the whole table
fairly well. We rely on this to generate pg_statistic, and it is a key
input to the planner. Why should we trust it any less when it comes to
reltuples? If the analyze sampling does not work, the fix would be to improve
that, not to disregard it piecemeal.
Well, that sounds reasonable. But the problem with the moving average
calculation remains. Suppose you run vacuum and not analyze. If the
updates are random enough, vacuum won't be able to reclaim all the
pages, so the number of pages will grow. Again, we'll have the same
thing where the number of pages grows, the real number of live tuples
stays constant, and the estimated reltuples grows after each vacuum run.
You claimed that before, with no more evidence than this time, and I still
don't follow your argument. The number of pages may indeed bloat but the
number of live tuples per page will fall. Ideally, at least, the estimate
would remain on-target. If it doesn't, there's some other effect that
you haven't explained. It doesn't seem to me that the use of a moving
average would prevent that from happening. What it *would* do is smooth
out errors from the inevitable sampling bias in any one vacuum or analyze
run, and that seems like a good thing.
I did some more calculations on paper to try to understand this. If we
average reltuples directly, instead of averaging tuple density, it
converges like it should. The error with this density calculation seems
to be that we're effectively multiplying the old density by the new
number of pages. I'm not sure why we even work with tuple density. We
could just estimate the number of tuples based on analyze/vacuum, and
then apply moving average to it. The calculations would be shorter, too.
What do you think?
I think you're reinventing the way we used to do it. Perhaps consulting
the git history in the vicinity of this code would be enlightening.
regards, tom lane
On 01.03.2018 18:09, Tom Lane wrote:
Ideally, at least, the estimate would remain on-target.
The test shows that under this particular scenario the estimated number
of tuples grows after each ANALYZE. I tried to explain how this happens
in the attached pdf. The direct averaging of the number of tuples, not
using the density, doesn't have this problem, so I suppose it could help.
I think you're reinventing the way we used to do it. Perhaps consulting
the git history in the vicinity of this code would be enlightening.
I see that before vac_estimate_reltuples was introduced, the results of
analyze and vacuum were used directly, without averaging. What I am
suggesting is to use a different way of averaging, not to remove it.
--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
On Thu, 1 Mar 2018 17:25:09 +0300
Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> wrote:
Well, that sounds reasonable. But the problem with the moving average
calculation remains. Suppose you run vacuum and not analyze. If the
updates are random enough, vacuum won't be able to reclaim all the
pages, so the number of pages will grow. Again, we'll have the same
thing where the number of pages grows, the real number of live tuples
stays constant, and the estimated reltuples grows after each vacuum run.
I agree VACUUM's moving average may be imperfect, but the rationale makes
sense and I don't have a plan to improve it now. This patch only intends to
improve the behavior of ANALYZE, by using the estimated row density times
relpages to get reltuples. It does not change VACUUM.
The problem with the moving average for ANALYZE is that it prevents ANALYZE
from changing the reltuples estimate enough for large tables.
Consider this based on the test setup from the patch:
create table big as select id*p, ten, hun, thou, tenk, lahk, meg, padding
from reltuples_test,
generate_series(0,9) g(p);
-- SELECT 100000000
alter table big set (autovacuum_enabled=false);
select count(*) from big;
-- count
-- 100000000
select reltuples::int, relpages from pg_class where relname = 'big';
-- reltuples | relpages
-- 0 | 0
analyze verbose big;
-- INFO: analyzing "public.big"
-- INFO: "big": scanned 30000 of 1538462 pages, containing 1950000 live rows and 0 dead rows;
-- 30000 rows in sample, 100000030 estimated total rows
select reltuples::int, relpages from pg_class where relname = 'big';
-- reltuples | relpages
-- 100000032 | 1538462
delete from big where ten > 1;
-- DELETE 80000000
select count(*) from big;
-- count
-- 20000000
select reltuples::int, relpages from pg_class where relname = 'big';
-- reltuples | relpages
-- 100000032 | 1538462
analyze verbose big;
-- INFO: analyzing "public.big"
-- INFO: "big": scanned 30000 of 1538462 pages, containing 388775 live rows and 1561225 dead rows;
-- 30000 rows in sample, 98438807 estimated total rows
select reltuples::int, relpages from pg_class where relname = 'big';
-- reltuples | relpages
-- 98438808 | 1538462
select count(*) from big;
-- count
-- 20000000
analyze verbose big;
-- INFO: analyzing "public.big"
-- INFO: "big": scanned 30000 of 1538462 pages, containing 390885 live rows and 1559115 dead rows;
-- 30000 rows in sample, 96910137 estimated total rows
select reltuples::int, relpages from pg_class where relname = 'big';
-- reltuples | relpages
-- 96910136 | 1538462
Table big has 1.5 million pages. ANALYZE samples 30 thousand. No matter how
many rows we change in the table, ANALYZE can only move the reltuples estimate
to old_estimate + (new_estimate - old_estimate) * (30000/1538462), i.e. about
1.9 percent of the way toward what the sample actually measured.
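Working that through for the second ANALYZE above: the sample itself implies
390885 / 30000 * 1538462 = ~20.0 million rows, but the estimate only moves
30000 / 1538462 = ~1.95% of the way there:
    98438807 + (20045391 - 98438807) * 0.0195 = ~96.9 million
which is, up to rounding, the 96910137 that ANALYZE reported.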
With the patch on this same table we get:
select count(*) from big;
-- count
-- 20000000
select reltuples::int, relpages from pg_class where relname = 'big';
-- reltuples | relpages
-- 96910136 | 1538462
analyze verbose big;
-- INFO: analyzing "public.big"
-- INFO: "big": scanned 30000 of 1538462 pages, containing 390745 live rows and 1559255 dead rows;
-- 30000 rows in sample, 20038211 estimated total rows
select reltuples::int, relpages from pg_class where relname = 'big';
-- reltuples | relpages
-- 20038212 | 1538462
-dg
--
David Gould daveg@sonic.net
If simplicity worked, the world would be overrun with insects.
Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> writes:
On 01.03.2018 18:09, Tom Lane wrote:
Ideally, at least, the estimate would remain on-target.
The test shows that under this particular scenario the estimated number
of tuples grows after each ANALYZE. I tried to explain how this happens
in the attached pdf.
I looked at this and don't think it really answers the question. What
happens is that, precisely because we only slowly adapt our estimate of
density towards the new measurement, we will have an overestimate of
density if the true density is decreasing (even if the new measurement is
spot-on), and that corresponds exactly to an overestimate of reltuples.
No surprise there. The question is why it fails to converge to reality
over time.
I think part of David's point is that because we only allow ANALYZE to
scan a limited number of pages even in a very large table, that creates
an artificial limit on the slew rate of the density estimate; perhaps
what's happening in his tables is that the true density is dropping
faster than that limit allows us to adapt. Still, if there's that
much going on in his tables, you'd think VACUUM would be touching
enough of the table that it would keep the estimate pretty sane.
So I don't think we yet have a convincing explanation of why the
estimates drift worse over time.
Anyway, I find myself semi-persuaded by his argument that we are
already assuming that ANALYZE has taken a random sample of the table,
so why should we not believe its estimate of density too? Aside from
simplicity, that would have the advantage of providing a way out of the
situation when the existing reltuples estimate has gotten drastically off.
The sticking point in my mind right now is, if we do that, what to do with
VACUUM's estimates. If you believe the argument in the PDF that we'll
necessarily overshoot reltuples in the face of declining true density,
then it seems like that argument applies to VACUUM as well. However,
VACUUM has the issue that we should *not* believe that it looked at a
random sample of pages. Maybe the fact that it looks exactly at the
changed pages causes it to see something less than the overall density,
cancelling out the problem, but that seems kinda optimistic.
Anyway, as I mentioned in the 2011 thread, the existing computation is
isomorphic to the rule "use the old density estimate for the pages we did
not look at, and the new density estimate --- ie, exactly scanned_tuples
--- for the pages we did look at". That still has a lot of intuitive
appeal, especially for VACUUM where there's reason to believe those page
populations aren't alike. We could recast the code to look like it's
doing that rather than doing a moving-average, although the outcome
should be the same up to roundoff error.
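Spelling that rule out in the code's notation makes the equivalence explicit:
tuples in the unscanned pages at the old density, plus scanned_tuples (that is,
new_density * scanned_pages) for the pages we did look at, is
    old_density * (total_pages - scanned_pages) + new_density * scanned_pages
      = (old_density + (new_density - old_density) * scanned_pages / total_pages)
        * total_pages
      = updated_density * total_pages
which is exactly what the moving-average code computes.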
regards, tom lane
On 02.03.2018 02:49, Tom Lane wrote:
I looked at this and don't think it really answers the question. What
happens is that, precisely because we only slowly adapt our estimate of
density towards the new measurement, we will have an overestimate of
density if the true density is decreasing (even if the new measurement is
spot-on), and that corresponds exactly to an overestimate of reltuples.
No surprise there. The question is why it fails to converge to reality
over time.
The calculation I made for the first step applies to the next steps too,
with minor differences. So, the estimate increases at each step. Just
out of interest, I plotted the reltuples for 60 steps, and it doesn't
look like it's going to converge anytime soon (see attached).
Looking at the formula, this overshoot term is created when we multiply
the old density by the new number of pages. I'm not sure how to fix
this. I think we could average the number of tuples, not the densities.
The attached patch demonstrates what I mean.
--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
reltuples-avg.patch (text/x-patch)
*** /tmp/DqhRGF_vacuum.c 2018-03-02 18:43:54.448046402 +0300
--- src/backend/commands/vacuum.c 2018-03-02 18:22:04.223070206 +0300
***************
*** 780,791 ****
BlockNumber scanned_pages,
double scanned_tuples)
{
- BlockNumber old_rel_pages = relation->rd_rel->relpages;
double old_rel_tuples = relation->rd_rel->reltuples;
! double old_density;
! double new_density;
double multiplier;
- double updated_density;
/* If we did scan the whole table, just use the count as-is */
if (scanned_pages >= total_pages)
--- 780,788 ----
BlockNumber scanned_pages,
double scanned_tuples)
{
double old_rel_tuples = relation->rd_rel->reltuples;
! double new_rel_tuples;
double multiplier;
/* If we did scan the whole table, just use the count as-is */
if (scanned_pages >= total_pages)
***************
*** 801,839 ****
return old_rel_tuples;
/*
! * If old value of relpages is zero, old density is indeterminate; we
! * can't do much except scale up scanned_tuples to match total_pages.
*/
! if (old_rel_pages == 0)
! return floor((scanned_tuples / scanned_pages) * total_pages + 0.5);
/*
! * Okay, we've covered the corner cases. The normal calculation is to
! * convert the old measurement to a density (tuples per page), then update
! * the density using an exponential-moving-average approach, and finally
! * compute reltuples as updated_density * total_pages.
! *
! * For ANALYZE, the moving average multiplier is just the fraction of the
! * table's pages we scanned. This is equivalent to assuming that the
! * tuple density in the unscanned pages didn't change. Of course, it
! * probably did, if the new density measurement is different. But over
! * repeated cycles, the value of reltuples will converge towards the
! * correct value, if repeated measurements show the same new density.
! *
! * For VACUUM, the situation is a bit different: we have looked at a
! * nonrandom sample of pages, but we know for certain that the pages we
! * didn't look at are precisely the ones that haven't changed lately.
! * Thus, there is a reasonable argument for doing exactly the same thing
! * as for the ANALYZE case, that is use the old density measurement as the
! * value for the unscanned pages.
! *
! * This logic could probably use further refinement.
*/
- old_density = old_rel_tuples / old_rel_pages;
- new_density = scanned_tuples / scanned_pages;
multiplier = (double) scanned_pages / (double) total_pages;
! updated_density = old_density + (new_density - old_density) * multiplier;
! return floor(updated_density * total_pages + 0.5);
}
--- 798,825 ----
return old_rel_tuples;
/*
! * Estimate the total number of tuples based on the density of scanned
! * tuples.
*/
! new_rel_tuples = floor((scanned_tuples / scanned_pages) * total_pages + 0.5);
/*
! * ANALYZE scans a representative subset of pages, so we trust its density
! * estimate.
! */
! if (is_analyze)
! return new_rel_tuples;
!
! /*
! * VACUUM scanned a nonrandom sample of pages, so we can't just scale up its
! * result. For the portion of table it didn't scan, use the old number of tuples,
! * and for the portion it did scan, use the number it reported. This is
! * effectively an exponential moving average with adaptive factor.
*/
multiplier = (double) scanned_pages / (double) total_pages;
! return floor(old_rel_tuples * (1. - multiplier)
! + new_rel_tuples * multiplier
! + 0.5);
}
analyze.png (image/png): plot of the estimated reltuples over 60 ANALYZE
iterations [binary image data not included]