abnormal data grow

Started by Reynard Hilmanover 21 years ago11 messagesgeneral
Jump to latest
#1Reynard Hilman
reynardmh@lightsky.com

Hi,

I have been having this problem where the database size suddenly grows
from the normal size of about 300Mb to 12Gb in one night.
When I look up the table size, the biggest one is only 41Mb and the
total of all table size is only 223Mb.
But in the filesystem data directory the total size is 12Gb. I noticed
there are 10 files with 1Gb size each:
1.1G 25677563
1.1G 25677563.1
1.1G 25677563.2
1.1G 25677563.3
1.1G 25677563.4
1.1G 25677563.5
1.1G 25677563.6
1.1G 25677563.7
1.1G 25677563.8
1.1G 25677563.9

So I tried to find out what table that is, but this query return nothing:
SELECT relname, relpages
FROM pg_class
WHERE relname = 'pg_toast_25677563' OR relname =
'pg_toast_25677563_index'
ORDER BY relname;

and if I run this query:
SELECT relname, relpages
FROM pg_class ORDER BY relpages desc

the top ones are these tables (which is not the id of the 1.1 Gb files)
pg_toast_25677561 1382845
pg_toast_25677561_index 22116

I'm just wondering is there a way to know what that 25677563 file is?
Why does postgres create a copy of that file with .1, .2, .3, etc. Is
that file still in used (because I can't find it in the pg_class table)?
Will postgres clean the database up or should I do a manual clean?

- reynard

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Reynard Hilman (#1)
Re: abnormal data grow

Reynard Hilman <reynardmh@lightsky.com> writes:

But in the filesystem data directory the total size is 12Gb. I noticed
there are 10 files with 1Gb size each:
1.1G 25677563
1.1G 25677563.1
...

I'm just wondering is there a way to know what that 25677563 file is?

select relname from pg_class where relfilenode = 25677563;

regards, tom lane

#3Doug McNaught
doug@mcnaught.org
In reply to: Reynard Hilman (#1)
Re: abnormal data grow

Reynard Hilman <reynardmh@lightsky.com> writes:

Hi,

I have been having this problem where the database size suddenly grows
from the normal size of about 300Mb to 12Gb in one night.
When I look up the table size, the biggest one is only 41Mb and the
total of all table size is only 223Mb.
But in the filesystem data directory the total size is 12Gb. I noticed
there are 10 files with 1Gb size each:
1.1G 25677563
1.1G 25677563.1
1.1G 25677563.2
1.1G 25677563.3
1.1G 25677563.4
1.1G 25677563.5
1.1G 25677563.6
1.1G 25677563.7
1.1G 25677563.8
1.1G 25677563.9

25677563 is the OID of this object. The different files are segments
of the relation, not duplicates (PG restricts individual data file
size to 1GB). So look for which relation has that OID--it's quite
possibly an index that is bloating up.

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863

#4Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Reynard Hilman (#1)
Re: abnormal data grow

On Tue, Sep 21, 2004 at 09:51:15AM -0500, Reynard Hilman wrote:

I'm just wondering is there a way to know what that 25677563 file is?
Why does postgres create a copy of that file with .1, .2, .3, etc.

Those are not copies. Postgres splits each relation (table/index) in 1GB
files. So the relation actually uses 10 GB; you need some cleanup.

Is that file still in used (because I can't find it in the pg_class
table)?

Yes. Don't delete it manually.

Will postgres clean the database up or should I do a manual clean?

You probably need a VACUUM; or, if it's an index, a REINDEX.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Granting software the freedom to evolve guarantees only different results,
not better ones." (Zygo Blaxell)

#5Reynard Hilman
reynardmh@lightsky.com
In reply to: Alvaro Herrera (#4)
Re: abnormal data grow

Is that file still in used (because I can't find it in the pg_class
table)?

Yes. Don't delete it manually.

Thanks for that advice :) I wasn't really going to delete it, just
tempted to.

following Tom's advice, this query:
select relname from pg_class where relfilenode = 25677563;
returns pg_toast_25677561 which is the record in pg_class that has the biggest relpages, so that makes sense.

I'll try running reindex tonight then, hopefully that does some clean up.

thanks for all the advice,
- reynard

#6Pierre-Frédéric Caillaud
lists@boutiquenumerique.com
In reply to: Reynard Hilman (#5)
gist index woes

Trying to build a gist index on a column in a table.
The table contains 100k rows.
The column is an integer[]. Each row contains about 20-30 distinct values
chosen between 1 and 437.
Aim : search the arrays with the gist integer array operators @ etc.

Creating the index with gist__int_ops takes forever and, after something
like one hour, fills the disk to the brim. There is about 4G free space on
the partition and it eats it completely.

Doing the same with only 10k rows takes forever too. Bumping up sort_mem
to 128 Meg does nothing. Only way to make it work is to create the index
on an empty table, and insert the rows afterwards. I only tried 10K rows
as disk space consumption is alarming.

Search speed in this 10k rows using the index is a lot slower (10x) than
sequential scan.

A problem with this contrib module ?

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Reynard Hilman (#5)
Re: abnormal data grow

Reynard Hilman <reynardmh@lightsky.com> writes:

following Tom's advice, this query:
select relname from pg_class where relfilenode = 25677563;
returns pg_toast_25677561 which is the record in pg_class that has the biggest relpages, so that makes sense.

Okay, so you have a lot of wide (toasted) fields in whatever table that
toast table belongs to --- if you're not sure, try

select relname from pg_class where
reltoastrelid = (select oid from pg_class where relfilenode = 25677563);

VACUUM VERBOSE on that table would give some useful info.

regards, tom lane

#8Reynard Hilman
reynardmh@lightsky.com
In reply to: Tom Lane (#7)
Re: abnormal data grow

Okay, so you have a lot of wide (toasted) fields in whatever table that
toast table belongs to --- if you're not sure, try

select relname from pg_class where
reltoastrelid = (select oid from pg_class where relfilenode = 25677563);

VACUUM VERBOSE on that table would give some useful info.

regards, tom lane

Thank you Tom! This query finds the culprit right away. There is one
table that acts as a log file, storing the transactions sql that's
happening on the db, so the db size grows when the logging is turned on.
After doing some cleaning on this table, it shrinks the database to
500Mb again.
Just curious though, why the toast file for this transaction table takes
the most space (the transaction table itself was only about 10Mb),
considering there are a few other tables with bigger size (40Mb) than
this transaction table but do not have toast file that's comparable in
size to this one.

- reynard

#9Oleg Bartunov
oleg@sai.msu.su
In reply to: Pierre-Frédéric Caillaud (#6)
Re: gist index woes

Have you tried gist__intbig_ops ?

Oleg
On Tue, 21 Sep 2004, [iso-8859-15] Pierre-Fr?d?ric Caillaud wrote:

Trying to build a gist index on a column in a table.
The table contains 100k rows.
The column is an integer[]. Each row contains about 20-30 distinct values
chosen between 1 and 437.
Aim : search the arrays with the gist integer array operators @ etc.

Creating the index with gist__int_ops takes forever and, after something
like one hour, fills the disk to the brim. There is about 4G free space on
the partition and it eats it completely.

Doing the same with only 10k rows takes forever too. Bumping up sort_mem
to 128 Meg does nothing. Only way to make it work is to create the index
on an empty table, and insert the rows afterwards. I only tried 10K rows
as disk space consumption is alarming.

Search speed in this 10k rows using the index is a lot slower (10x) than
sequential scan.

A problem with this contrib module ?

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#10Pierre-Frédéric Caillaud
lists@boutiquenumerique.com
In reply to: Oleg Bartunov (#9)
Re: gist index woes

Thanks for the hint, it did work.
The table weights 270 MB, the index weights 580 MB, but searches are fast.

Show quoted text

Have you tried gist__intbig_ops ?

Oleg
On Tue, 21 Sep 2004, [iso-8859-15] Pierre-Fr?d?ric Caillaud wrote:

Trying to build a gist index on a column in a table.
The table contains 100k rows.
The column is an integer[]. Each row contains about 20-30 distinct
values
chosen between 1 and 437.
Aim : search the arrays with the gist integer array operators @ etc.

Creating the index with gist__int_ops takes forever and, after
something
like one hour, fills the disk to the brim. There is about 4G free space
on
the partition and it eats it completely.

Doing the same with only 10k rows takes forever too. Bumping up
sort_mem
to 128 Meg does nothing. Only way to make it work is to create the index
on an empty table, and insert the rows afterwards. I only tried 10K rows
as disk space consumption is alarming.

Search speed in this 10k rows using the index is a lot slower (10x)
than
sequential scan.

A problem with this contrib module ?

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

#11Ed L.
pgsql@bluepolka.net
In reply to: Reynard Hilman (#8)
Re: abnormal data grow

On Tuesday September 21 2004 12:40, Reynard Hilman wrote:

Okay, so you have a lot of wide (toasted) fields in whatever table that
toast table belongs to ...

Just curious though, why the toast file for this transaction table takes
the most space (the transaction table itself was only about 10Mb),
considering there are a few other tables with bigger size (40Mb) than
this transaction table but do not have toast file that's comparable in
size to this one.

Not sure this is your problem, but when I have tables whose on-disk size is
significantly larger than the size reported from contrib's dbsize.sql
queries (if that's what you're using), I begin to suspect diskspace
"leakage" due to insufficient postgresql.conf settings for max_fsm_pages
and max_fsm_relations. Those settings need to be bigger than the max
amount of diskspace and number of relations that you'll ever use, or you
may experience this leakage, requiring a "vacuum full" or dump/reload
(alias downtime) to repair.

Ed