(Possibly) Unbounded Database Size Increase - Test Case
There has been a discussion on the general list about this area. One of
the members produced a test case for demonstrating rapid size increase.
I decided to see if I could induce similar behaviour with a more
(seemingly) benign example.
I tried this :
1) Create a table and load 100000 rows (with a primary key)
2) Run several threads that each update 1 row and commit (looping
continuously, with a rest every 100 updates or so)
3) Run 1 thread that (lazy) vacuums (every 3 minutes or so)
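For concreteness, here is a minimal SQL sketch of the workload (the table
definition and update statement are the ones given further down; the looping,
parameter binding and commit logic live in the driver scripts, so take this
as indicative only) :

  CREATE TABLE grow (id integer, body text, CONSTRAINT grow_pk PRIMARY KEY (id));

  -- each updater thread loops on this, committing every statement and
  -- resting briefly every ~100 updates:
  UPDATE grow SET body = ? WHERE id = ?;

  -- the maintenance thread, every 3 minutes or so (lazy, non-FULL vacuum):
  VACUUM grow;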
I ran 10 threads in 2) and saw my database grow from the initial size of
150M by about 1G per hour (I stopped my test after 5 hours @ 4.5G).
The table concerned uses a large text field... it might be instructive
to see if this is central to producing this growth (I will see if a more
conventional table design can exhibit this behaviour if anyone is keen
to know).
For those interested the test case I used can be found here :
http://homepages.slingshot.co.nz/~markir/tar/test/spin.tar.gz
regards
Mark
Mark kirkwood <markir@slingshot.co.nz> writes:
I ran 10 threads in 2) and saw my database grow from the initial size of
150M by about 1G per hour (I stopped my test after 5 hours @ 4.5G).
Which files grew exactly? (Main table, indexes, toast table, toast index?)
Was the FSM size parameter set large enough to cover the amount of space
you need the system to be able to recycle --- viz, the amount used
between vacuum runs? As with most everything else in PG, the default
value is not real large: 10000 pages = 80MB.
regards, tom lane
On Wed, 2002-05-08 at 01:45, Tom Lane wrote:
Which files grew exactly? (Main table, indexes, toast table, toast index?)
Here is a listing (from another run - I dumped and reloaded before getting
any of that info last time...)
[:/data1/pgdata/7.2/base/23424803]$ du -sk .
4900806 .
-rw------- 1 postgres dba 1073741824 May 9 21:20 23424806.3
-rw------- 1 postgres dba 1073741824 May 9 21:19 23424806.2
-rw------- 1 postgres dba 1073741824 May 9 21:18 23424806.1
-rw------- 1 postgres dba 1073741824 May 9 21:16 23424806
-rw------- 1 postgres dba 124444672 May 9 21:16 23424808
-rw------- 1 postgres dba 587505664 May 9 21:14 23424806.4
-rw------- 1 postgres dba 5914624 May 9 21:05 23424804
-rw------- 1 postgres dba 2441216 May 9 21:05 23424809
These files are for :
grow=# select relname,oid
grow-# from pg_class where oid in
grow-# ('23424806','23424808','23424804','23424809');
        relname        |   oid
-----------------------+----------
 pg_toast_23424804_idx | 23424808
 pg_toast_23424804     | 23424806
 grow_pk               | 23424809
 grow                  | 23424804
(4 rows)
so the big guy is the toast table and index
- BTW the table design is
CREATE TABLE grow (
    id integer,
    body text,
    CONSTRAINT grow_pk PRIMARY KEY (id)
);
The row length is big ~ 14K. I am wondering if this behaviour will "go
away" if I recompile with a 32K page size (I also seem to recall I can
tell Pg not to toast certain column types).
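(For the record, the catalog tweak I am thinking of is something like the
following - a sketch only, and editing pg_attribute directly is of course
at-your-own-risk; note also that a plain-stored row has to fit in a single
8K page, so the ~14K rows would need shortening first.)

  -- force plain, untoasted storage for the body column ('p' = PLAIN):
  UPDATE pg_attribute
  SET attstorage = 'p'
  WHERE attname = 'body'
  AND attrelid = (SELECT oid FROM pg_class WHERE relname = 'grow');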
Was the FSM size parameter set large enough to cover the amount of space
you need the system to be able to recycle --- viz, the amount used
between vacuum runs? As with most everything else in PG, the default
value is not real large: 10000 pages = 80MB.
I thought I was generous here ...~ 960M free space map
max_fsm_relations = 100 # min 10, fsm is free space map
max_fsm_pages = 120000 # min 1000, fsm is free space map
I think I need to count how many vacuums were performed during the test, so I
can work out if this amount should have been enough. I timed a vacuum
just now at 12 minutes. (So with 10 concurrent threads it could take a lot
longer during the run.)
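(Back of the envelope : max_fsm_pages = 120000 covers 120000 x 8K = ~960M
worth of pages. Growth was ~1G per hour with vacuums at least 12 minutes
apart, i.e. at most ~5 vacuum cycles per hour, so the net growth per cycle
is only ~200M = ~25000 pages - on the face of it well inside the map.)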
regards
Mark
Mark kirkwood <markir@slingshot.co.nz> writes:
Was the FSM size parameter set large enough to cover the amount of space
you need the system to be able to recycle --- viz, the amount used
between vacuum runs? As with most everything else in PG, the default
value is not real large: 10000 pages = 80MB.
I thought I was generous here ...~ 960M free space map
max_fsm_relations = 100 # min 10, fsm is free space map
max_fsm_pages = 120000 # min 1000, fsm is free space map
I think I need to count how many vacuums were performed during the test, so I
can work out if this amount should have been enough. I timed a vacuum
just now at 12 minutes. (So with 10 concurrent threads it could take a lot
longer during the run.)
Keep in mind also that you need enough FSM entries to keep track of
partially-full pages. To really lock things down and guarantee no
table growth you might need one FSM slot for every page in your
relations. In practice you should be able to get away with much less
than that: you certainly don't need entries for pages with no free
space, and pages with only a little free space shouldn't be worth
tracking either. But if your situation is 100% update turnover between
vacuums then you could have a worst-case situation where all the pages
have roughly 50% free space right after a vacuum, and if you fail to
track them *all* then you're probably going to see some table growth
in the next cycle.
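(Concretely: a 1000-page table whose every row is updated once between
vacuums balloons to about 2000 pages, each roughly half free after the
vacuum - so you'd want about 2000 FSM slots to track them all.)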
I believe that with a more reasonable vacuum frequency (vacuum after
10% to 25% turnover, say) the FSM requirements should be a lot less.
But I have not had time to do any experimentation to arrive at a rule
of thumb for vacuum frequency vs. FSM requirements. If you or someone
could run some experiments, it'd be a big help.
regards, tom lane
On Thu, 2002-05-09 at 14:21, Mark kirkwood wrote:
On Wed, 2002-05-08 at 01:45, Tom Lane wrote:
Which files grew exactly? (Main table, indexes, toast table, toast index?)
Here is a listing (from another run - I dumped and reloaded before getting
any of that info last time...)
[:/data1/pgdata/7.2/base/23424803]$ du -sk .
4900806 .
-rw------- 1 postgres dba 1073741824 May 9 21:20 23424806.3
-rw------- 1 postgres dba 1073741824 May 9 21:19 23424806.2
-rw------- 1 postgres dba 1073741824 May 9 21:18 23424806.1
-rw------- 1 postgres dba 1073741824 May 9 21:16 23424806
-rw------- 1 postgres dba 124444672 May 9 21:16 23424808
-rw------- 1 postgres dba 587505664 May 9 21:14 23424806.4
-rw------- 1 postgres dba 5914624 May 9 21:05 23424804
-rw------- 1 postgres dba 2441216 May 9 21:05 23424809
These files are for :
grow=# select relname,oid
grow-# from pg_class where oid in
grow-# ('23424806','23424808','23424804','23424809');
        relname        |   oid
-----------------------+----------
 pg_toast_23424804_idx | 23424808
 pg_toast_23424804     | 23424806
 grow_pk               | 23424809
 grow                  | 23424804
(4 rows)
so the big guy is the toast table and index
- BTW the table design is
CREATE TABLE grow (
    id integer,
    body text,
    CONSTRAINT grow_pk PRIMARY KEY (id)
);
Was it not the case that lazy vacuum had problems freeing tuples that
have toasted fields ?
The row length is big ~ 14K. I am wondering if this behaviour will "go
away" if I recompile with a 32K page size (I also seem to recall I can
tell Pg not to toast certain column types).
----------
Hannu
Hannu Krosing <hannu@tm.ee> writes:
Was it not the case that lazy vacuum had problems freeing tuples that
have toasted fields ?
News to me if so.
regards, tom lane
On Sat, 2002-05-11 at 11:24, Tom Lane wrote:
Hannu Krosing <hannu@tm.ee> writes:
Was it not the case that lazy vacuum had problems freeing tuples that
have toasted fields ?
News to me if so.
regards, tom lane
It looks like this may in fact be the case.
I performed a number of tests using the previous setup, but shortening the row length and
using pg_attribute.attstorage to untoast the text field for some of the tests.
The difference is striking.
The behaviour of the untoasted case is pretty much as expected :
the database grows a bit and then stabilizes at some size.
However I could not get any size stabilization in the toasted case.
Here are (some) of my test results :
Fsm Siz | Threads | Toast | Init(M) | End(M) | Stable | Stable Time(h) | Run Time(h)
--------+---------+-------+---------+--------+--------+----------------+------------
  20000 |       2 |   Y   |     166 |    380 |   N    |       -        |         17
  60000 |       2 |   Y   |     166 |    430 |   N    |       -        |         20
  10000 |       2 |   N   |     162 |    235 |   Y    |      0.5       |          1
  20000 |       2 |   N   |     166 |    235 |   Y    |      0.5       |         13
  60000 |       2 |   N   |     166 |    235 |   Y    |      0.5       |         13
legend :
Fsm Siz = max_fsm_pages
Threads = no. update threads
Toast = whether body field was toasted
Init = initial database size
End = final database size
Stable = whether database growth had stopped
Stable Time = when stable size was achieved
Run Time = length of test run (excluding initial database population)
Average vacuum time = 300s
Typical (1 thread) entire table update time = 2000s
Row length = 7.5K
The scripts I used are here :
http://homepages.slingshot.co.nz/~markir/tar/test/spin.tar.gz
At this point I am wondering about sending this in as a bug report - what do you think ?
regards,
Mark
Mark kirkwood <markir@slingshot.co.nz> writes:
However I could not get any size stabilization in the toasted case.
Hmm. Which file(s) were growing, exactly? How many row updates is this
run covering?
I'd rather expect the toast indexes to grow given the lack-of-btree-
collapse-logic issue. However, the rate of growth ought to be pretty
tiny --- much less than the amount of data being pumped through, for
sure.
regards, tom lane
Hannu Krosing <hannu@tm.ee> writes:
On Sun, 2002-05-19 at 19:37, Tom Lane wrote:
I'd rather expect the toast indexes to grow given the lack-of-btree-
collapse-logic issue.
Why should the toast indexes grow significantly more than the primary key
of the main table ?
Well, the toast indexes will grow because they're using an OID key,
and so the range of indexed values keeps increasing. AFAIR Mark didn't
say whether he *had* a primary key, let alone what it was --- but it's
possible that he has one that has a range that's not changing over the
test.
In particular, if the test consists simply of updating the toasted
field, that will not change the primary keys at all ... but it will
change the toast table's key range, because each new value will get
a new toast OID.
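(For reference, the toast index is on (chunk_id, chunk_seq), and chunk_id
is a freshly assigned OID for every new out-of-line value - so its key
range marches steadily upward even while the main table's primary key
stands still.)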
regards, tom lane
On Sun, 2002-05-19 at 19:37, Tom Lane wrote:
Mark kirkwood <markir@slingshot.co.nz> writes:
However I could not get any size stabilization in the toasted case.
Hmm. Which file(s) were growing, exactly? How many row updates is this
run covering?
I'd rather expect the toast indexes to grow given the lack-of-btree-
collapse-logic issue.
Why should the toast indexes grow significantly more than the primary key
of the main table ?
However, the rate of growth ought to be pretty
tiny --- much less than the amount of data being pumped through, for
sure.
----------
Hannu
Hannu Krosing <hannu@tm.ee> writes:
But does PG not have a new index entry for each _version_ of a table row ?
Sure, but the entries do go away during vacuum.
Or does the lack-of-btree-collapse-logic affect only indexes where there
are many _different_ keys and not many repeating keys?
The problem is that once the btree is constructed, the key ranges
assigned to the existing leaf pages can't grow, only shorten due
to page splits. So if you've got, say,
1 2 3 | 4 5 6 | 7 8 9
(schematically suggesting 3 leaf pages with 9 keys) and you delete
keys 1-3 and vacuum, you now have
- - - | 4 5 6 | 7 8 9
Lots of free space in leaf page 1, but that doesn't help you when you
want to insert keys 10, 11, 12. That leaf page can only be used for
keys <= 3, or possibly <= 4, depending on what boundary key is shown
in the next btree level. So if you reinsert rows with the same range
of keys as you had before, no index growth. If the range of keys
moves, new pages will keep getting added on at the right end of the
btree. Old pages at the left end will never go away, even if they
become mostly or entirely empty.
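To continue the schematic: inserting keys 10, 11, 12 now gives
- - - | 4 5 6 | 7 8 9 | 10 11 12
with a brand-new page on the right, while the empty page on the left
lives on forever.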
AFAICS we cannot fix this except by reverse-splitting adjacent index
pages when their combined usage falls below some threshold. (The
reverse split would give us one unused page that could be put in a
freelist and then used somewhere else in the index structure.)
In principle VACUUM could do this, but it's ticklish to code, especially
given the desire not to acquire exclusive locks while vacuuming.
regards, tom lane
On Mon, 2002-05-20 at 16:08, Tom Lane wrote:
Hannu Krosing <hannu@tm.ee> writes:
On Sun, 2002-05-19 at 19:37, Tom Lane wrote:
I'd rather expect the toast indexes to grow given the lack-of-btree-
collapse-logic issue.
Why should the toast indexes grow significantly more than the primary key
of the main table ?
Well, the toast indexes will grow because they're using an OID key,
and so the range of indexed values keeps increasing. AFAIR Mark didn't
say whether he *had* a primary key, let alone what it was --- but it's
possible that he has one that has a range that's not changing over the
test.
his table is this:
CREATE TABLE grow (id integer,body text,CONSTRAINT grow_pk PRIMARY KEY (id))
In particular, if the test consists simply of updating the toasted
field, that will not change the primary keys at all ... but it will
change the toast table's key range, because each new value will get
a new toast OID.
But does PG not have a new index entry for each _version_ of a table row ?
Or does the lack-of-btree-collapse-logic affect only indexes where there
are many _different_ keys and not many repeating keys?
--------------
Hannu
Hmm. Which file(s) were growing, exactly? How many row updates is this
run covering?
The toast table gets about 90 percent of the growth, followed by the toast
index at about 9 percent. The actual table + primary key stay at about 2M each.
I neglected to mention what the update statement actually was :
UPDATE grow SET body = ? WHERE id = ?
So the untoasted elements are not being altered at all...
A typical run has 2 threads each of which updates the entire table (20,000
rows) every 2000 s.
The vacuum thread manages to get 6-7 vacuums in before both threads update the
entire table.
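(Roughly, then : 2 x 20000 rows per 2000s = ~20 updates/sec, with a vacuum
about every 300s, gives ~6000 new row versions per vacuum cycle. At ~7.5K
per row that is ~45M, or about 5600 8K pages of turnover per cycle - well
under even the smallest max_fsm_pages setting in the runs above.)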
regards
Mark
<markir@slingshot.co.nz> writes:
The toast table gets about 90 percent of the growth, followed by the toast
index at about 9 percent. The actual table + primary key stay at about 2M each.
Odd. I wonder whether you are looking at an unintended behavior of the
free space map's thresholding mechanism. The toast table will generally
have large tuples of consistent size (about 2K each). This will cause
the FSM threshold for whether to remember a page to approach 2K, which
probably will mean that we forget about pages that could still hold one
toast tuple. That might be enough to cause the growth. It may be
worth playing around with the details of the threshold-setting policy.
In particular, I'd suggest altering the code in GetPageWithFreeSpace
and RecordAndGetPageWithFreeSpace (in
src/backend/storage/freespace/freespace.c) to make the threshold
converge towards something less than the average request size, perhaps
average/2, which you could do with
- cur_avg += ((int) spaceNeeded - cur_avg) / 32;
+ cur_avg += (((int) spaceNeeded)/2 - cur_avg) / 32;
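(Here cur_avg is a running average of recent request sizes, nudged 1/32 of
the way toward each new request; feeding it spaceNeeded/2 instead makes the
remember-this-page threshold settle near half the typical request size.)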
Possibly the initial threshold set in create_fsm_rel also needs to be
smaller than it is. Not sure about that though.
Let me know how that affects your results ...
regards, tom lane
On Tue, 21 May 2002 11:10:04 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
Odd. I wonder whether you are looking at an unintended behavior of the
free space map's thresholding mechanism. The toast table will generally
have large tuples of consistent size (about 2K each).
So we have 4 tuples per page?
This will cause
the FSM threshold for whether to remember a page to approach 2K, which
probably will mean that we forget about pages that could still hold one
toast tuple.
I thought I was able to follow you up to here.
That might be enough to cause the growth.
Here I'm lost. The effect you mention explains growth up to a state
where each toast table page holds 3 instead of 4 tuples (1.33 *
initial size). Now with each UPDATE we get pages with significantly
more free space than 2K. Even if a few thousand pages are added
before the next VACUUM, we still reach a stable size. Of course this
only holds if there are enough FSM slots, which Mark claims to have.
So IMHO there have to be additional reasons causing *unbounded*
growth. Or am I missing something?
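(Putting numbers on it : 8K pages holding ~2K toast tuples means 4 per
page, so the worst case above is 166M * 4/3 = ~220M. Mark's toasted runs
were already at 380M and 430M and still growing.)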
Just my 0.02.
Servus
Manfred
Manfred Koizar <mkoi-pg@aon.at> writes:
Here I'm lost. The effect you mention explains growth up to a state
where each toast table page holds 3 instead of 4 tuples (1.33 *
initial size). Now with each UPDATE we get pages with significantly
more free space than 2K.
Good point, it should still stabilize with at worst 33% overhead. So
maybe I'm barking up the wrong tree.
Still, the FSM code is new in 7.2 and I'm quite prepared to believe that
the effect Mark is seeing indicates some problem in it. Anyone care to
sit down and read through freespace.c? It's pretty liberally commented.
regards, tom lane
On Wed, 2002-05-22 at 03:10, Tom Lane wrote:
(snippage) That might be enough to cause the growth. It may be
worth playing around with the details of the threshold-setting policy.
(snippage)
Possibly the initial threshold set in create_fsm_rel also needs to be
smaller than it is. Not sure about that though.
Let me know how that affects your results ...
I will try some changes out here (and look at freespace.c in general) -
but don't let that stop anyone else examining it as well... :-)
(I am on holiday for 10 days as of 24/05, so I may not report anything
for a little while)
regards
Mark