VACUUM, 24/7 availability and 7.2
I'm doing some work for a smallish company which conducts
its business largely online. Currently they have a legacy
mishmash of Oracle and MySQL databases which they wish
to unify one one platform (RDBMS with client access via
browser and custom serverside applications for employees
and customers).
PostgreSQL would be my primary candidate. However the company's
operating requirments mean that the data needed for interaction
with customers / website users must be available on a 24/7 basis.
This is primarily a) data related to product ordering and
tables for storing order data; and b) website user authentication
and personalisation data (logins, user preferences etc).
It is therefore not an option to have these databases offline
at regular intervals for any significant length of time for
VACUUMing. Replicating data to say MySQL databases is
technically feasible, at least in the case of b) above, but
not desirable. Are there any existing "native" PostgreSQL solutions
to this problem?
More importantly, what is the situation on VACUUM for release 7.2?
It seems from the pgsql-hackers list that there are plans for
a none-exclusively locking VACUUM, e.g.:
(sorry about the long URL); how far advanced are they, and is
there any kind of release schedule for 7.2?
Any answers (or pointers thereto, haven't found any myself :-()
much appreciated
Ian Barwick
--
Remove SUNGLASSES to reply ;-)
Ian Barwick <SUNGLASSESbarwick@gmx.net> writes:
PostgreSQL would be my primary candidate. However the company's
operating requirments mean that the data needed for interaction
with customers / website users must be available on a 24/7 basis.
This is primarily a) data related to product ordering and
tables for storing order data; and b) website user authentication
and personalisation data (logins, user preferences etc).It is therefore not an option to have these databases offline
at regular intervals for any significant length of time for
VACUUMing. Replicating data to say MySQL databases is
technically feasible, at least in the case of b) above, but
not desirable. Are there any existing "native" PostgreSQL solutions
to this problem?
VACUUM doesn't require taking the DB offline. It does lock tables
while it's vacuuming them, which may or may not cause problems
depending on how long the vacuum takes. You seem to understand this;
I'm just correcting your terminology. ;)
You can reduce the time taken by VACUUM by running it more often,
which may make it a non-issue for you. Depends entirely on table size
and user activity.
More importantly, what is the situation on VACUUM for release 7.2?
It seems from the pgsql-hackers list that there are plans for
a none-exclusively locking VACUUM, e.g.:
As I understand it:
* Non-intrusive VACUUM will be in 7.2
* 7.2 is about to enter beta (this week hopefully)
I would say it's likely that beta will last two or three months based
on my experience with the 7.1 beta period.
-Doug
--
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863
Import Notes
Reply to msg id not found: IanBarwick'smessageofWed10Oct2001132756+0200
Just to keep things in perspective, how large are your current databases, and
what do you or the company consider to be a signficant length of time? Right
now I have a development database with just a few thousand records of test data,
and vacuum takes just a very few seconds a day. I think I recall hearing on
this list of it taking a minute or three for databases several gigabytes in
size. For some sites this would be tolerable, for others it wouldn't.
I'm also interested to hear what the future holds for vacuum. If nothing else,
it couldn't hurt postgresql's public relations. :-)
--Wes Sheldahl
Ian Barwick <SUNGLASSESbarwick%gmx.net@interlock.lexmark.com> on 10/10/2001
07:27:56 AM
To: pgsql-general%postgresql.org@interlock.lexmark.com
cc: (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject: [GENERAL] VACUUM, 24/7 availability and 7.2
I'm doing some work for a smallish company which conducts
its business largely online. Currently they have a legacy
mishmash of Oracle and MySQL databases which they wish
to unify one one platform (RDBMS with client access via
browser and custom serverside applications for employees
and customers).
PostgreSQL would be my primary candidate. However the company's
operating requirments mean that the data needed for interaction
with customers / website users must be available on a 24/7 basis.
This is primarily a) data related to product ordering and
tables for storing order data; and b) website user authentication
and personalisation data (logins, user preferences etc).
It is therefore not an option to have these databases offline
at regular intervals for any significant length of time for
VACUUMing. Replicating data to say MySQL databases is
technically feasible, at least in the case of b) above, but
not desirable. Are there any existing "native" PostgreSQL solutions
to this problem?
More importantly, what is the situation on VACUUM for release 7.2?
It seems from the pgsql-hackers list that there are plans for
a none-exclusively locking VACUUM, e.g.:
(sorry about the long URL); how far advanced are they, and is
there any kind of release schedule for 7.2?
Any answers (or pointers thereto, haven't found any myself :-()
much appreciated
Ian Barwick
--
Remove SUNGLASSES to reply ;-)
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
Import Notes
Resolved by subject fallback
On Wed, 10 Oct 2001 wsheldah@lexmark.com wrote:
Just to keep things in perspective, how large are your current databases, and
what do you or the company consider to be a signficant length of time? Right
now I have a development database with just a few thousand records of test data,
and vacuum takes just a very few seconds a day. I think I recall hearing on
this list of it taking a minute or three for databases several gigabytes in
size. For some sites this would be tolerable, for others it wouldn't.
The runtime seems to be dependent on what hind of activity you do. If you
frequently rollback inserts, vacuum seems to take longer. Say, a database
with 1m records which endures 5m inserts + rollback is going to take a
while to vacuum.
Just my informal observations, not based on structured testing.
-jwb
and vacuum takes just a very few seconds a day. I think I recall hearing
on
this list of it taking a minute or three for databases several gigabytes
in
size. For some sites this would be tolerable, for others it wouldn't.
It depends more on user activity and number/size of indexes than over all
database size from what I've seen.. In one database I have 20ish tables,
some have 70,000ish records but the whole database isn't very big -- VACUUM
takes a while because of the amount of UPDATE'd and DELETE'd records I have
every day, and the number of indexes (lots!)....
I'm also interested to hear what the future holds for vacuum. If nothing
else,
Show quoted text
it couldn't hurt postgresql's public relations. :-)
--Wes Sheldahl
On Wed, Oct 10, 2001 at 04:22:07PM -0400, wsheldah@lexmark.com wrote:
Just to keep things in perspective, how large are your current databases, and
what do you or the company consider to be a signficant length of time? Right
now I have a development database with just a few thousand records of test data,
and vacuum takes just a very few seconds a day. I think I recall hearing on
this list of it taking a minute or three for databases several gigabytes in
size. For some sites this would be tolerable, for others it wouldn't.
We are having some trouble with some tables in which we have lots of
update's (and insert/delete's). "A lot" being several thousands per day
(I haven't measured the exact numbers recently). VACUUM is running twice
a day and locks these tables a long time where 10-15 minutes is not
exceptional. This table has only approx 100k records, but these are
updated very often and that seems to cause to much rubbish for vacuum
I'm also interested to hear what the future holds for vacuum. If nothing else,
it couldn't hurt postgresql's public relations. :-)
indeed. we are currently considering moving the guilty tables over til
MySQL as we're not using any advanced features in it, while keeping all
other data in PostgreSQL. Anything that keeps us from doing that would
be good PR :)
/erwin
--Wes Sheldahl
Ian Barwick <SUNGLASSESbarwick%gmx.net@interlock.lexmark.com> on 10/10/2001
07:27:56 AMTo: pgsql-general%postgresql.org@interlock.lexmark.com
cc: (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject: [GENERAL] VACUUM, 24/7 availability and 7.2I'm doing some work for a smallish company which conducts
its business largely online. Currently they have a legacy
mishmash of Oracle and MySQL databases which they wish
to unify one one platform (RDBMS with client access via
browser and custom serverside applications for employees
and customers).PostgreSQL would be my primary candidate. However the company's
operating requirments mean that the data needed for interaction
with customers / website users must be available on a 24/7 basis.
This is primarily a) data related to product ordering and
tables for storing order data; and b) website user authentication
and personalisation data (logins, user preferences etc).It is therefore not an option to have these databases offline
at regular intervals for any significant length of time for
VACUUMing. Replicating data to say MySQL databases is
technically feasible, at least in the case of b) above, but
not desirable. Are there any existing "native" PostgreSQL solutions
to this problem?More importantly, what is the situation on VACUUM for release 7.2?
It seems from the pgsql-hackers list that there are plans for
a none-exclusively locking VACUUM, e.g.:(sorry about the long URL); how far advanced are they, and is
there any kind of release schedule for 7.2?Any answers (or pointers thereto, haven't found any myself :-()
much appreciatedIan Barwick
--
Remove SUNGLASSES to reply ;-)
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?http://archives.postgresql.org
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
--
Erwin Lansing -- http://droso.org
I love deadlines. -- Douglas Adams
I love the whooshing sound they make as the fly by.
On Wed, Oct 10, 2001 at 04:22:07PM -0400, wsheldah@lexmark.com wrote:
Just to keep things in perspective, how large are your current databases, and
what do you or the company consider to be a signficant length of time? Right
now I have a development database with just a few thousand records of test data,
and vacuum takes just a very few seconds a day. I think I recall hearing on
this list of it taking a minute or three for databases several gigabytes in
size. For some sites this would be tolerable, for others it wouldn't.
We are having some trouble with some tables in which we have lots of
update's (and insert/delete's). "A lot" being several thousands per day
(I haven't measured the exact numbers recently). VACUUM is running twice
a day and locks these tables a long time where 10-15 minutes is not
exceptional. This table has only approx 100k records, but these are
updated very often and that seems to cause to much rubbish for vacuum
I'm also interested to hear what the future holds for vacuum. If nothing else,
it couldn't hurt postgresql's public relations. :-)
indeed. we are currently considering moving the guilty tables over til
MySQL as we're not using any advanced features in it, while keeping all
other data in PostgreSQL. Anything that keeps us from doing that would
be good PR :)
/erwin
--Wes Sheldahl
Ian Barwick <SUNGLASSESbarwick%gmx.net@interlock.lexmark.com> on 10/10/2001
07:27:56 AMTo: pgsql-general%postgresql.org@interlock.lexmark.com
cc: (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject: [GENERAL] VACUUM, 24/7 availability and 7.2I'm doing some work for a smallish company which conducts
its business largely online. Currently they have a legacy
mishmash of Oracle and MySQL databases which they wish
to unify one one platform (RDBMS with client access via
browser and custom serverside applications for employees
and customers).PostgreSQL would be my primary candidate. However the company's
operating requirments mean that the data needed for interaction
with customers / website users must be available on a 24/7 basis.
This is primarily a) data related to product ordering and
tables for storing order data; and b) website user authentication
and personalisation data (logins, user preferences etc).It is therefore not an option to have these databases offline
at regular intervals for any significant length of time for
VACUUMing. Replicating data to say MySQL databases is
technically feasible, at least in the case of b) above, but
not desirable. Are there any existing "native" PostgreSQL solutions
to this problem?More importantly, what is the situation on VACUUM for release 7.2?
It seems from the pgsql-hackers list that there are plans for
a none-exclusively locking VACUUM, e.g.:(sorry about the long URL); how far advanced are they, and is
there any kind of release schedule for 7.2?Any answers (or pointers thereto, haven't found any myself :-()
much appreciatedIan Barwick
--
Remove SUNGLASSES to reply ;-)
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?http://archives.postgresql.org
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
--
Erwin Lansing -- http://droso.org
I love deadlines. -- Douglas Adams
I love the whooshing sound they make as the fly by.
The runtime seems to be dependent on what hind of activity you do. If you
frequently rollback inserts, vacuum seems to take longer. Say, a database
with 1m records which endures 5m inserts + rollback is going to take a
while to vacuum.
As another data point, daily vacuums on our database take 25 to 30 minutes.
This is on a dual PIII 933Mhz with 2GB RAM and the database is on a 4 Drive
RAID 10 array (stripe + mirror) of 15K RPM SCSI drives. Our database is
9GB. I don't really have a good estimate as to the number of insertions and
updates that occur daily. I have some mass updates that I hold until the
weekend. PostgreSQL is 7.03. I see vacuum downtime as the number 1 drawback
to PostgreSQL.
where in the postgres dox can we find info abt the vacuum feature?
thx...
On Wed, 10 Oct 2001, Ian Barwick wrote:
I'm doing some work for a smallish company which conducts
its business largely online. Currently they have a legacy
mishmash of Oracle and MySQL databases which they wish
to unify one one platform (RDBMS with client access via
browser and custom serverside applications for employees
and customers).PostgreSQL would be my primary candidate. However the company's
operating requirments mean that the data needed for interaction
with customers / website users must be available on a 24/7 basis.
This is primarily a) data related to product ordering and
tables for storing order data; and b) website user authentication
and personalisation data (logins, user preferences etc).It is therefore not an option to have these databases offline
at regular intervals for any significant length of time for
VACUUMing. Replicating data to say MySQL databases is
technically feasible, at least in the case of b) above, but
not desirable. Are there any existing "native" PostgreSQL solutions
to this problem?More importantly, what is the situation on VACUUM for release 7.2?
It seems from the pgsql-hackers list that there are plans for
a none-exclusively locking VACUUM, e.g.:(sorry about the long URL); how far advanced are they, and is
there any kind of release schedule for 7.2?Any answers (or pointers thereto, haven't found any myself :-()
much appreciated
There is a faster, non-exclusive-locking VACUUM in the CVS now; this
should become part of 7.2. You can download the nightly snapsot and build
it to test it with your application.
HTH.
More importantly, what is the situation on VACUUM for release 7.2?
It seems from the pgsql-hackers list that there are plans for
a none-exclusively locking VACUUM, e.g.:(sorry about the long URL); how far advanced are they,
It's long since done.
and is there any kind of release schedule for 7.2?
Beta release is ... um ... any day now. I think we're just waiting on
Lockhart to say he's done tweaking the datetime datatypes.
regards, tom lane
Erwin Lansing <erwin@lansing.dk> writes:
We are having some trouble with some tables in which we have lots of
update's (and insert/delete's). "A lot" being several thousands per day
(I haven't measured the exact numbers recently). VACUUM is running twice
a day and locks these tables a long time where 10-15 minutes is not
exceptional. This table has only approx 100k records, but these are
updated very often and that seems to cause to much rubbish for vacuum
You might try running VACUUM much more often, as it'll run faster with
less work to do. If running it every hour takes the locked time down
to 30 seconds, you might find that acceptable (depending on the
application of course).
Worth a try until 7.2 is solid...
-Doug
--
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863
Import Notes
Reply to msg id not found: ErwinLansing'smessageofWed10Oct2001231128+0200
More importantly, what is the situation on VACUUM for release 7.2?
It seems from the pgsql-hackers list that there are plans for
a none-exclusively locking VACUUM, e.g.:(sorry about the long URL); how far advanced are they,
It's long since done.
==>> This means that it will not be included in 7.2? I've read 7.2
documentation on line and i've seen that the VACUUM command is changed:
now, when run in normal mode (giving to the backend the VACUUM command
without any parameter), the tables don't need to be locked and also that
the command does not minimize the space of the database (as instead the
actual 7.1.3 VACUUM does). From the documentation:
-----
Plain VACUUM (without FULL) simply reclaims space and makes it available
for re-use. This form of the command can operate in parallel with normal
reading and writing of the table. VACUUM FULL does more extensive
processing, including moving of tuples across blocks to try to compact the
table to the minimum number of disk blocks. This form is much slower and
requires an exclusive lock on each table while it is being processed.
-----
This way to do is similar (but not equal) to the LAZY VACUUM specified by
Tom Lane in the above link...
In conclusion, the new VACUUM command as described above will be include in
the 7.2 version of Postgresql?
Denis Gasparin: denis@edistar.com
---------------------------
Programmer & System Administrator - Edistar srl
Import Notes
Resolved by subject fallback
More importantly, what is the situation on VACUUM for release 7.2?
It seems from the pgsql-hackers list that there are plans for
a none-exclusively locking VACUUM, e.g.:(sorry about the long URL); how far advanced are they,
It's long since done.
==>> This means that it will not be included in 7.2? I've read 7.2
documentation on line and i've seen that the VACUUM command is changed:
now, when run in normal mode (giving to the backend the VACUUM command
without any parameter), the tables don't need to be locked and also that
the command does not minimize the space of the database (as instead the
actual 7.1.3 VACUUM does). From the documentation:
-----
Plain VACUUM (without FULL) simply reclaims space and makes it available
for re-use. This form of the command can operate in parallel with normal
reading and writing of the table. VACUUM FULL does more extensive
processing, including moving of tuples across blocks to try to compact the
table to the minimum number of disk blocks. This form is much slower and
requires an exclusive lock on each table while it is being processed.
-----
This way to do is similar (but not equal) to the LAZY VACUUM specified by
Tom Lane in the above link...
In conclusion, the new VACUUM command as described above will be include in
the 7.2 version of Postgresql?
Denis Gasparin: denis@edistar.com
---------------------------
Programmer & System Administrator - Edistar srl
Denis Gasparin <denis@edinet.it> writes:
This way to do is similar (but not equal) to the LAZY VACUUM specified by
Tom Lane in the above link...
In conclusion, the new VACUUM command as described above will be include in
the 7.2 version of Postgresql?
The on-line docs are accurate (at least the VACUUM reference page is;
I don't think I've updated all of the Admin Guide discussions yet).
The link quoted earlier was just an early proposal about what it should
look like, not a final spec.
regards, tom lane
Ok, little language qui pro quo...
I'm sorry for the error...
Denis
At 15/10/01 17.00, Andrew Sullivan wrote:
Show quoted text
On Mon, Oct 15, 2001 at 10:40:17AM +0200, Denis Gasparin wrote:
It's long since done.
==>> This means that it will not be included in 7.2? I've read 7.2
No, it means it _will_ be included.
A
-- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.com> M6K 3E3 +1 416 646 3304 x110
Import Notes
Reply to msg id not found: 20011015110010.A17770@mail.libertyrms.com
On Mon, Oct 15, 2001 at 05:06:05PM +0200, Denis Gasparin wrote:
Ok, little language qui pro quo...
I'm sorry for the error...
No need to be sorry! Just thought I'd clarify.
--
----
Andrew Sullivan 87 Mowat Avenue
Liberty RMS Toronto, Ontario Canada
<andrew@libertyrms.com> M6K 3E3
+1 416 646 3304 x110
On Fri, 12 Oct 2001, Joel Burton wrote:
You can download the nightly snapsot and build
Sorry, I havn�t found the nightly snapshot. Any URL for people like
me who have trouble to use CVS due to firewall constraints?
Kind regards
Andreas.
Try:
http://www.ca.postgresql.org/ftpsite/dev/
Show quoted text
-----Original Message-----
From: Tille, Andreas [SMTP:TilleA@rki.de]
Sent: Tuesday, October 16, 2001 10:45 AM
Cc: PostgreSQL General
Subject: Re: [GENERAL] VACUUM, 24/7 availability and 7.2On Fri, 12 Oct 2001, Joel Burton wrote:
You can download the nightly snapsot and build
Sorry, I havn�t found the nightly snapshot. Any URL for people like
me who have trouble to use CVS due to firewall constraints?Kind regards
Andreas.
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
Import Notes
Resolved by subject fallback