Vacuum DB in Postgres Vs similar concept in other RDBMS

Started by Harpreet Dhaliwalalmost 19 years ago12 messagesgeneral
Jump to latest
#1Harpreet Dhaliwal
harpreet.dhaliwal01@gmail.com

Hi,

I was wondering if Vacuum DB concept in Postgres is really novel and there's
no concept like this in other RDBMS like oracle or sql server.
If at all other RDBMS have such a concept implemented, how good or bad it is
as compared to postgres's vacuum db concept.

Any type of pointers would be highly appreciated.

Thanks,
~Harpreet

#2Alexander Staubo
alex@purefiction.net
In reply to: Harpreet Dhaliwal (#1)
Re: Vacuum DB in Postgres Vs similar concept in other RDBMS

On 5/23/07, Harpreet Dhaliwal <harpreet.dhaliwal01@gmail.com> wrote:

I was wondering if Vacuum DB concept in Postgres is really novel and there's
no concept like this in other RDBMS like oracle or sql server.
If at all other RDBMS have such a concept implemented, how good or bad it is
as compared to postgres's vacuum db concept.

As we told you the last time you asked
(http://archives.postgresql.org/pgsql-general/2007-05/msg00074.php),
the concept of vacuuming is not unique to PostgreSQL by far. It is an
inherent facet of MVCC.

Other databases that implement MVCC, or implement an MVCC-like system
that requires garbage collection, include Oracle, SQLite, Firebird and
its parent project InterBase. Wikipedia has a decent article on MVCC:

http://en.wikipedia.org/wiki/Multiversion_concurrency_control

Alexander.

#3André Volpato
andre.volpato@ecomtecnologia.com.br
In reply to: Alexander Staubo (#2)
Remove query results from cache

Hello,

Is there any way to remove the results of certain query, from the memory
cache ?
I´m doing some performance tests, and I need the planner to make his
work every time I run the statements, without changing them.

Running vmstat, I can se the memory cache grows, and the planner do not
'forget' the results of any query until the cache reach 2 Gb (total box
RAM) , or the server is rebooted.

--
[]´s,

André Volpato
ECOM Tecnologia Ltda
andre.volpato@ecomtecnologia.com.br
(41) 3014 2322

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: André Volpato (#3)
Re: Remove query results from cache

Andr� Volpato wrote:

Hello,

Is there any way to remove the results of certain query, from the memory
cache ?
I�m doing some performance tests, and I need the planner to make his
work every time I run the statements, without changing them.

Running vmstat, I can se the memory cache grows, and the planner do not
'forget' the results of any query until the cache reach 2 Gb (total box
RAM) , or the server is rebooted.

Stop postmaster, unmount the filesystem, mount, restart postmaster. The
problem is not only Postgres' own cache, but the kernel cache as well,
which is why you need the unmount step.

Maybe remounting is good enough, but I'm not sure
mount -o remount /where/lies/data

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#5Dann Corbit
DCorbit@connx.com
In reply to: Harpreet Dhaliwal (#1)
Re: Vacuum DB in Postgres Vs similar concept in other RDBMS

In SQL*Server it is called "UPDATE STATISTICS"

http://msdn2.microsoft.com/en-us/library/ms187348.aspx

Oracle tuning is a lot more fiddly:

http://searchoracle.techtarget.com/originalContent/0,289142,sid41_gci121
3646,00.html

________________________________

From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Harpreet
Dhaliwal
Sent: Wednesday, May 23, 2007 10:49 AM
To: Postgres General
Subject: [GENERAL] Vacuum DB in Postgres Vs similar concept in other
RDBMS

Hi,

I was wondering if Vacuum DB concept in Postgres is really novel and
there's no concept like this in other RDBMS like oracle or sql server.
If at all other RDBMS have such a concept implemented, how good or bad
it is as compared to postgres's vacuum db concept.

Any type of pointers would be highly appreciated.

Thanks,
~Harpreet

#6Alexander Staubo
alex@purefiction.net
In reply to: Dann Corbit (#5)
Re: Vacuum DB in Postgres Vs similar concept in other RDBMS

On 5/23/07, Dann Corbit <DCorbit@connx.com> wrote:

In SQL*Server it is called "UPDATE STATISTICS"

http://msdn2.microsoft.com/en-us/library/ms187348.aspx

No -- MS SQL Server's "update statistics" is the equivalent of
"analyze", not "vacuum."

Alexander.

#7Harpreet Dhaliwal
harpreet.dhaliwal01@gmail.com
In reply to: Alexander Staubo (#6)
Re: Vacuum DB in Postgres Vs similar concept in other RDBMS

I was just wondering if Vacuum Db in postgresql is somehow superior to the
ones that we have in other RDBMS.

Show quoted text

On 5/23/07, Alexander Staubo <alex@purefiction.net> wrote:

On 5/23/07, Dann Corbit <DCorbit@connx.com> wrote:

In SQL*Server it is called "UPDATE STATISTICS"

http://msdn2.microsoft.com/en-us/library/ms187348.aspx

No -- MS SQL Server's "update statistics" is the equivalent of
"analyze", not "vacuum."

Alexander.

#8Chris Browne
cbbrowne@acm.org
In reply to: Harpreet Dhaliwal (#1)
Re: Vacuum DB in Postgres Vs similar concept in other RDBMS

harpreet.dhaliwal01@gmail.com ("Harpreet Dhaliwal") writes:

I was just wondering if Vacuum Db in postgresql is somehow superior
to the ones that we have in other RDBMS.

The thing that is more akin to VACUUM, in Oracle's case, is the
rollback segment. In Oracle, Rollback segments are areas in your
database which are used to temporarily save the previous values when
some updates are going on.

In the case of Oracle, if a transaction rolls back, it has to go and
do some work to clean up after the dead transaction.

This is not *exactly* like PostgreSQL's notion of vacuuming, but
that's the nearest equivalent that Oracle has.

The Oracle InnoDB product also has the notion of rollback segments; if
you use InnoDB tables with MySQL, the rollback functionality has much
the same behaviour as Oracle.

Note that in the case of PostgreSQL, the MVCC behaviour (which
requires VACUUMing) has the merit that COMMIT and ROLLBACK both have
near-zero costs; in either case, the cost is merely to mark the
transaction as either committed or failed. Data doesn't have to be
touched at time of COMMIT/ROLLBACK; any costs that need to be paid are
deferred to VACUUM time.
--
select 'cbbrowne' || '@' || 'acm.org';
http://www3.sympatico.ca/cbbrowne/postgresql.html
"For be a man's intellectual superiority what it will, it can never
assume the practical, available supremacy over other men, without the
aid of some sort of external arts and entrenchments, always, in
themselves, more or less paltry and base. This it is, that forever
keeps God's true princes of the Empire from the world's hustings; and
leaves the highest honors that this air can give, to those men who
become famous more through their infinite inferiority to the choice
hidden handful of the Divine Inert, than through their undoubted
superiority over the dead level of the mass." --Moby Dick, Ch 33

#9Ron Johnson
ron.l.johnson@cox.net
In reply to: Chris Browne (#8)
Re: Vacuum DB in Postgres Vs similar concept in other RDBMS

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 05/23/07 19:17, Chris Browne wrote:

harpreet.dhaliwal01@gmail.com ("Harpreet Dhaliwal") writes:

I was just wondering if Vacuum Db in postgresql is somehow superior
to the ones that we have in other RDBMS.

The thing that is more akin to VACUUM, in Oracle's case, is the
rollback segment. In Oracle, Rollback segments are areas in your
database which are used to temporarily save the previous values when
some updates are going on.

In the case of Oracle, if a transaction rolls back, it has to go and
do some work to clean up after the dead transaction.

This is not *exactly* like PostgreSQL's notion of vacuuming, but
that's the nearest equivalent that Oracle has.

That's the only other way to do it, no?

(Rdb/VMS has dynamically-created [made when a process attaches to
the db] Recovery Unit Journal files that store the record before-
images.)

The Oracle InnoDB product also has the notion of rollback segments; if
you use InnoDB tables with MySQL, the rollback functionality has much
the same behaviour as Oracle.

Note that in the case of PostgreSQL, the MVCC behaviour (which
requires VACUUMing) has the merit that COMMIT and ROLLBACK both have
near-zero costs; in either case, the cost is merely to mark the
transaction as either committed or failed. Data doesn't have to be
touched at time of COMMIT/ROLLBACK; any costs that need to be paid are
deferred to VACUUM time.

So it's not "near-zero cost", it's "deferred cost".

- --
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGVN1mS9HxQb37XmcRAsNbAJ9hgkDpUQGVR1yxb2WrpP/m3U36eQCghv7d
9FWyD8TbSOxXiaa0e8lK5/4=
=W63C
-----END PGP SIGNATURE-----

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ron Johnson (#9)
Re: Vacuum DB in Postgres Vs similar concept in other RDBMS

Ron Johnson <ron.l.johnson@cox.net> writes:

harpreet.dhaliwal01@gmail.com ("Harpreet Dhaliwal") writes:

I was just wondering if Vacuum Db in postgresql is somehow superior
to the ones that we have in other RDBMS.

So it's not "near-zero cost", it's "deferred cost".

Exactly. VACUUM sucks (ahem) in all ways but one: it pushes the
maintenance costs associated with MVCC out of the foreground query code
paths and into an asynchronous cleanup task. AFAIK we are the only DBMS
that does it that way. Personally I believe it's a fundamentally
superior approach --- because when you are under peak load you can defer
the cleanup work --- but you do need to pay attention to make sure that
the async cleanup isn't postponed too long. We're still fooling around
with autovacuum and related tuning issues to make it work painlessly...

regards, tom lane

#11Robert Treat
xzilla@users.sourceforge.net
In reply to: Ron Johnson (#9)
Re: Vacuum DB in Postgres Vs similar concept in other RDBMS

On Wednesday 23 May 2007 20:33, Ron Johnson wrote:

On 05/23/07 19:17, Chris Browne wrote:

harpreet.dhaliwal01@gmail.com ("Harpreet Dhaliwal") writes:

I was just wondering if Vacuum Db in postgresql is somehow superior
to the ones that we have in other RDBMS.

The thing that is more akin to VACUUM, in Oracle's case, is the
rollback segment. In Oracle, Rollback segments are areas in your
database which are used to temporarily save the previous values when
some updates are going on.

In the case of Oracle, if a transaction rolls back, it has to go and
do some work to clean up after the dead transaction.

This is not *exactly* like PostgreSQL's notion of vacuuming, but
that's the nearest equivalent that Oracle has.

That's the only other way to do it, no?

You can also take care of the maintenence part both inline (as opposed to a
seperate segment) and at commit time (rather than delay for a vacuum). See
the current HOT patch for a similar implementation to this idea.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

#12Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Tom Lane (#10)
Re: Vacuum DB in Postgres Vs similar concept in other RDBMS

Tom Lane wrote:

Exactly. VACUUM sucks (ahem) in all ways but one: it pushes the
maintenance costs associated with MVCC out of the foreground query code
paths and into an asynchronous cleanup task. AFAIK we are the only DBMS
that does it that way. Personally I believe it's a fundamentally
superior approach --- because when you are under peak load you can defer
the cleanup work --- but you do need to pay attention to make sure that
the async cleanup isn't postponed too long. We're still fooling around
with autovacuum and related tuning issues to make it work painlessly...

Should this paragraph be added to the FAQ here?
http://www.postgresql.org/docs/faqs.FAQ.html