Memory usage during vacuum

Started by Shelby Cain · about 22 years ago · 10 messages · general
#1 Shelby Cain
alyandon@yahoo.com

Version: PostgreSQL 7.4.1 on i686-pc-cygwin, compiled
by GCC gcc (GCC) 3.3.1 (cygming special)

postgresql.conf settings:

tcpip_socket = true
max_connections = 16
shared_buffers = 2048 # min 16, at least max_connections*2, 8KB each
sort_mem = 2048 # min 64, size in KB
vacuum_mem = 8192 # min 1024, size in KB
wal_buffers = 16 # min 4, 8KB each
checkpoint_segments = 9 # in logfile segments, min 1, 16MB each
effective_cache_size = 3000 # typically 8KB each
random_page_cost = 2 # units are one sequential page fetch cost
cpu_index_tuple_cost = 0.0001 # 0.001(same)
default_statistics_target = 300 # range 1-1000
log_timestamp = true
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = false

This is on a workstation so I've purposely limited the
amount of memory that would be used. I would have
assumed that some combination of (shared_buffers*8 +
vacuum_mem) plus a little overhead would be in the
neighborhood of the maximum amount of memory used
during the vacuum analyze process. However, I've
noticed that when I hit some very large tables the
backend's memory usage will soar to as high as 100+
megs. I'm trying to keep postgresql's memory usage
under 40 megs under all conditions so that other
services/applications don't grind to a halt due to
swapping. Is there any way to achieve my goal?
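
(For reference, the arithmetic behind that expectation with the settings above: 2048 shared_buffers × 8 KB = 16 MB, plus vacuum_mem of 8192 KB = 8 MB, i.e. roughly 24 MB plus overhead.)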

Regards,

Shelby Cain


#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Shelby Cain (#1)
Re: Memory usage during vacuum

Shelby Cain <alyandon@yahoo.com> writes:

I'm trying to keep postgresql's memory usage
under 40 megs under all conditions so that other
services/applications don't grind to a halt due to
swapping. Is there any way to achieve my goal?

Don't use VACUUM FULL. The vacuum_mem setting only limits the space
consumed by plain VACUUM --- VACUUM FULL needs to keep track of all the
free space in the table, and will eat as much memory as it has to to do
that.
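
For illustration, a minimal sketch of the difference (the table name here is hypothetical; vacuum_mem is the 7.4-era setting, in KB):

SET vacuum_mem = 8192;     -- bounds the memory plain VACUUM uses to track dead tuples
VACUUM ANALYZE big_table;  -- memory use stays roughly within vacuum_mem
VACUUM FULL big_table;     -- tracks all free space in the table; not bounded by vacuum_mem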

regards, tom lane

#3 Uwe C. Schroeder
uwe@oss4u.com
In reply to: Tom Lane (#2)
Re: Memory usage during vacuum


How about plugging in more memory ?
40MB seems a bit low for a database server footprint - well, certainly depends
on what you do.
But if your machine starts swapping with an extra 40 MB of memory consumption
I'd say the machine is undersized for the application. I usually have around
500 MB free memory with everything running. Memory is cheap nowadays...

On Thursday 25 March 2004 08:15 am, Tom Lane wrote:

Shelby Cain <alyandon@yahoo.com> writes:

I'm trying to keep postgresql's memory usage
under 40 megs under all conditions so that other
services/applications don't grind to a halt due to
swapping. Is there any way to achieve my goal?

Don't use VACUUM FULL. The vacuum_mem setting only limits the space
consumed by plain VACUUM --- VACUUM FULL needs to keep track of all the
free space in the table, and will eat as much memory as it has to to do
that.

regards, tom lane


--
UC

--
Open Source Solutions 4U, LLC
2570 Fleetwood Drive
San Bruno, CA 94066
United States
Phone: +1 650 872 2425
Cell: +1 650 302 2405
Fax: +1 650 872 2417

#4 Shelby Cain
alyandon@yahoo.com
In reply to: Tom Lane (#2)
Re: Memory usage during vacuum

I apologize for my original post being unclear. I'm
running "vacuum analyze" and seeing the behavior
mentioned. Does specifying the analyze option imply
"vacuum full"?

On a hunch I just ran analyze <really big table> and
the backend's memory usage soared up to 100+ megs. I
suspect that means it isn't the vacuum but the analyze
that is eating all my precious ram. :)

Any tips on minimizing the memory footprint during analyze (i.e., backing off the 300 setting that I'm currently using), or is this just something I'll have to live with?

Regards,

Shelby Cain

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

Don't use VACUUM FULL. The vacuum_mem setting only limits the space consumed by plain VACUUM --- VACUUM FULL needs to keep track of all the free space in the table, and will eat as much memory as it has to to do that.

regards, tom lane


#5 Shelby Cain
alyandon@yahoo.com
In reply to: Uwe C. Schroeder (#3)
Re: Memory usage during vacuum

I agree in principle that the solution is to run on a
server with more memory instead of my local
development box. However, I'm not going to be able to
simply request that additional memory be installed as
these are "standard" boxes that IT distributes to
employees.

Regardless, I'm more curious about whether I was overlooking a setting that could reduce the memory footprint during a vacuum analyze cycle than about actually getting it reduced. If it becomes a major pain I'll simply run the thing during off hours while I'm not at work. :)

Regards,

Shelby Cain

--- "Uwe C. Schroeder" <uwe@oss4u.com> wrote:

How about plugging in more memory?
40MB seems a bit low for a database server footprint - well, certainly depends on what you do.
But if your machine starts swapping with an extra 40 MB of memory consumption I'd say the machine is undersized for the application. I usually have around 500 MB free memory with everything running. Memory is cheap nowadays...


#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Shelby Cain (#4)
Re: Memory usage during vacuum

Shelby Cain <alyandon@yahoo.com> writes:

I apologize for my original post being unclear. I'm
running "vacuum analyze" and seeing the behavior
mentioned. Does specifying the analyze option imply
"vacuum full"?

No; I just figured you were probably using FULL without saying so.
However ...

On a hunch I just ran analyze <really big table> and
the backend's memory usage soared up to 100+ megs.

... the issue is obviously ANALYZE and not VACUUM at all. What
statistics targets are you using? It's hard to believe ANALYZE
would eat that much space unless it's being asked for a really
large target.
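
If it helps, a quick sketch of checking and experimenting with the target (default_statistics_target can be set per session in 7.4; the table name is hypothetical):

SHOW default_statistics_target;       -- what the default target currently is
SET default_statistics_target = 10;   -- lower it for this session only
ANALYZE VERBOSE big_table;            -- re-run and watch the backend's memory use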

regards, tom lane

#7 Uwe C. Schroeder
uwe@oss4u.com
In reply to: Shelby Cain (#5)
Re: Memory usage during vacuum


On Thursday 25 March 2004 09:12 am, Shelby Cain wrote:

I agree in principle that the solution is to run on a
server with more memory instead of my local
development box. However, I'm not going to be able to
simply request that additional memory be installed as
these are "standard" boxes that IT distributes to
employees.

Too bad that.

Regardless, I'm more curious about whether I was overlooking a setting that could reduce the memory footprint during a vacuum analyze cycle than about actually getting it reduced. If it becomes a major pain I'll simply run the thing during off hours while I'm not at work. :)

That's what I would do. Depending on how many transactions (inserts/deletes) you do a day, it might be perfectly fine to just run a vacuum full analyze every night via cron. You could also reduce max_connections and shared_buffers, but that may have a rather significant impact on performance. So it's probably best to leave those settings alone (when developing you don't want to wait a minute per query) and rather do the cleanup at night, or maybe an additional run during lunch if needed.
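
For example, the nightly job itself could be as simple as the following, scheduled from cron via psql or vacuumdb (a sketch, not tuned for any particular workload):

VACUUM FULL ANALYZE;  -- vacuums and re-analyzes every table in the database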


--
UC

--
Open Source Solutions 4U, LLC
2570 Fleetwood Drive
San Bruno, CA 94066
United States
Phone: +1 650 872 2425
Cell: +1 650 302 2405
Fax: +1 650 872 2417

#8 Shelby Cain
alyandon@yahoo.com
In reply to: Tom Lane (#6)
Re: Memory usage during vacuum

Currently my default is 300 (yes - very large, I know), but overriding default_statistics_target with a value of 1 and re-running vacuum analyze on the same large table results in no change in maximum memory consumption that I can see. It should be noted that I see this behavior with only one table; all other tables, even relatively large ones, only grow the backend to about 30 megabytes.

Here is some sample output:

c1scain=# set default_statistics_target=300;
SET
c1scain=# vacuum verbose analyze inventory_txns;
INFO: vacuuming "public.inventory_txns"
INFO: index "idx_inventory_txns_txndate" now contains 957655 row versions in 5770 pages
DETAIL: 1686 index pages have been deleted, 1686 are currently reusable.
CPU 0.51s/0.21u sec elapsed 5.20 sec.
INFO: index "idx_inventory_txns_locitemnbr" now contains 957655 row versions in 4752 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.53s/0.09u sec elapsed 2.73 sec.
INFO: "inventory_txns": found 0 removable, 957655 nonremovable row versions in 39549 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 15914 unused item pointers.
0 pages are entirely empty.
CPU 2.16s/0.60u sec elapsed 15.76 sec.
INFO: analyzing "public.inventory_txns"
INFO: "inventory_txns": 39549 pages, 150000 rows sampled, 944729 estimated total rows

I notice that PostgreSQL decided to sample 150000 rows.

Now, when I try the following:

c1scain=# set default_statistics_target=1;
SET
c1scain=# vacuum verbose analyze inventory_txns;
INFO: vacuuming "public.inventory_txns"
INFO: index "idx_inventory_txns_txndate" now contains 957655 row versions in 5770 pages
DETAIL: 1686 index pages have been deleted, 1686 are currently reusable.
CPU 0.37s/0.15u sec elapsed 5.29 sec.
INFO: index "idx_inventory_txns_locitemnbr" now contains 957655 row versions in 4752 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.54s/0.09u sec elapsed 2.93 sec.
INFO: "inventory_txns": found 0 removable, 957655 nonremovable row versions in 39549 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 15914 unused item pointers.
0 pages are entirely empty.
CPU 2.21s/0.51u sec elapsed 16.37 sec.
INFO: analyzing "public.inventory_txns"
INFO: "inventory_txns": 39549 pages, 150000 rows sampled, 944729 estimated total rows
VACUUM

It still decided to sample 150000 rows. Am I missing
something obvious here? Shouldn't fewer rows be
sampled when I set the collection target to 1?

Regards,

Shelby Cain

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

... the issue is obviously ANALYZE and not VACUUM at all. What statistics targets are you using? It's hard to believe ANALYZE would eat that much space unless it's being asked for a really large target.


#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Shelby Cain (#8)
Re: Memory usage during vacuum

Shelby Cain <alyandon@yahoo.com> writes:

It still decided to sample 150000 rows. Am I missing
something obvious here? Shouldn't fewer rows be
sampled when I set the collection target to 1?

The sample size is 300 rows times the largest per-column analysis
target, where default_statistics_target is used if the recorded
per-column setting is -1. I would say that you have set a target of 500
for at least one of the columns of that table, using ALTER TABLE SET
STATISTICS. Try this to see which:

select attname, attstattarget from pg_attribute
where attrelid = 'table_name_here'::regclass;
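
(That arithmetic matches the output above: 300 × 500 = 150,000, which is exactly the 150000 rows sampled.)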

regards, tom lane

#10 Shelby Cain
alyandon@yahoo.com
In reply to: Tom Lane (#9)
Re: Memory usage during vacuum

I had thought that I had dropped and reloaded this table, but apparently I hadn't, and I had set the statistics target for one column to 500 while experimenting. Resetting it to -1 and running with a default of 300 brings the memory footprint during the analyze down to roughly 70 MB now.
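
In case it's useful to anyone else, the reset itself is just the following (a sketch; the column name is a guess based on the index names above):

ALTER TABLE inventory_txns ALTER COLUMN locitemnbr SET STATISTICS -1;  -- -1 falls back to default_statistics_target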

Thanks Tom for indulging my curiosity on the matter.
I've learned something that I didn't readily pick up
from reading the documentation.

Regards,

Shelby Cain

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

Shelby Cain <alyandon@yahoo.com> writes:

It still decided to sample 150000 rows. Am I missing something obvious here? Shouldn't fewer rows be sampled when I set the collection target to 1?

The sample size is 300 rows times the largest per-column analysis target, where default_statistics_target is used if the recorded per-column setting is -1. I would say that you have set a target of 500 for at least one of the columns of that table, using ALTER TABLE SET STATISTICS. Try this to see which:

select attname, attstattarget from pg_attribute
where attrelid = 'table_name_here'::regclass;

regards, tom lane
